Research And Implementation Of Video Action Recognition Based On Feature Fusion And Hybrid Attention Mechanism

Posted on:2022-02-02

Degree:Master

Type:Thesis

Country:China

Candidate:B Y Liu

Full Text:PDF

GTID:2518306746481964

Subject:Software engineering

Abstract/Summary:


With the development of the Internet and computer technology, large amounts of data must be processed promptly and effectively in fields such as short-video entertainment, urban security, accident early warning, and fire protection. The amount of data is still growing rapidly, which creates an urgent demand for video understanding and analysis technology. In recent years, the application of deep learning has driven rapid progress in this field, and action recognition, as one of its key technologies, has advanced it further. Because video data is so large, achieving high-precision recognition at low computational cost remains a major challenge in video action recognition.

In video action recognition tasks, deep neural networks usually rely on high-level features for prediction and classification. As network depth increases, however, the resolution of the feature maps decreases, which makes it difficult to judge subtle actions accurately. To address the low resolution of high-level features and the weak semantic information of low-level features, we fuse multi-scale features in the spatial dimension. The speed of an action is also an important cue for its category, which calls for further fusion of features in the temporal dimension. This thesis therefore adopts spatio-temporal multi-scale feature fusion to improve recognition accuracy for subtle and speed-sensitive actions.

Much as people do when judging actions, an attention mechanism helps the network extract the key information of an action efficiently by learning various characteristics. Existing attention methods, however, usually focus on a single type of feature. Building on a summary of existing attention mechanisms, this thesis proposes a hybrid attention module that further improves recognition accuracy by screening and fusing spatio-temporal, channel, and motion characteristics. Within the module, spatio-temporal attention characterizes the spatial and temporal features of actions; channel attention enhances the interdependence of channels in the time domain; and motion attention models the feature-level temporal difference between two adjacent frames.

Combining these two lines of research, this thesis proposes a video action recognition method based on feature fusion and a hybrid attention mechanism. The hybrid attention mechanism is embedded into the EfficientNet framework to improve feature screening, and the multi-scale feature fusion module is introduced to enhance the representation ability of features at each level. Extensive experiments are performed on the EgoGesture, Something-Something V2, and Mini-Kinetics datasets. The results show that the proposed method effectively improves recognition accuracy for subtle and speed-sensitive actions and better handles video action recognition in complex scenes.
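The abstract does not give the formulation of the motion attention branch, so the following is only a toy sketch of the general idea of weighting features by the temporal difference between adjacent frames. Everything in it is an assumption made for illustration: the function names `sigmoid` and `motion_attention`, the use of spatially pooled per-channel descriptors, the absolute difference, the zero difference for the last frame, and the residual form `x + x * gate`. A real module would normally apply learned convolutions before and after the difference.

```python
import math


def sigmoid(x):
    """Logistic function, used here to turn a difference into a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def motion_attention(frames):
    """Reweight per-frame channel descriptors by adjacent-frame differences.

    frames: list of T frames, each a list of C floats (hypothetical
    spatially pooled channel descriptors). Returns a list of the same shape.
    """
    num_frames = len(frames)
    out = []
    for t in range(num_frames):
        # Compare frame t with frame t + 1; the last frame has no successor,
        # so it is compared with itself (assumption: zero difference).
        nxt = frames[t + 1] if t + 1 < num_frames else frames[t]
        # Channels that change more between the two frames get a larger gate.
        gates = [sigmoid(abs(b - a)) for a, b in zip(frames[t], nxt)]
        # Residual reweighting keeps the original feature and adds the gated copy.
        out.append([x + x * g for x, g in zip(frames[t], gates)])
    return out


# Two channels over three frames: channel 0 is static, channel 1 changes once.
frames = [[1.0, 1.0], [1.0, 3.0], [1.0, 3.0]]
weighted = motion_attention(frames)
```

In this toy input, the static channel receives the baseline gate of 0.5 in every frame, while the channel that changes between the first and second frame receives a larger gate in the first frame.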

Keywords/Search Tags:

Video Understanding, Action Recognition, Feature Fusion, Mixed Attention


Related items

1	Research And Implementation Of Video Action Recognition Based On Long-Time Feature Fusion And Attention Mechanism
2	Analyzing And Understanding Human Actions In Videos
3	Deep Feature Fusion And Attention Models For Video Action Recognition
4	Research And Implementation Of Action Recognition Based On Deep Learning
5	Research On Spatiotemporal Information Fusion And Attention Enhancement Based Human Action Recognition
6	Human Action Recognition Based On Attention Mechanism And Multi-Modality Feature Fusion
7	Research On Image-based Action Recognition Based On Context And Feature Fusion
8	Action Recognition Based On Interactions
9	Fine-Grained Representations And Applications Of Human Action In Video Understanding
10	Feature Detection And Action Recognition Of Moving Human Body In Video Based On Improved HOG