Research On Human Action Recognition Based On Phase Spectrum Motion Saliency Detection And Self-attention Mechanism

Posted on:2024-03-17

Degree:Master

Type:Thesis

Country:China

Candidate:G W Xu

GTID:2568307139996349

Subject:Master of Electronic Information (Professional Degree)

Abstract/Summary:

Human action recognition is an active research area in computer vision that aims to recognize action categories from videos. Because human actions unfold over a continuous period of time, an accurate recognition model must fully account for their temporal features. Most existing action recognition methods rely on optical flow to extract these temporal features. However, optical flow is sensitive to slight motion variations between frames, which are often irrelevant background noise such as swaying leaves and audience movements. In addition, the traditional temporal global average pooling layer in convolutional neural networks fails to capture the order and relative importance of temporal features, which may be key to distinguishing actions. To address these issues, this study makes the following two contributions.

(1) To enhance the model’s ability to extract temporal features and filter out background noise, this study proposes an action recognition method based on phase spectrum motion saliency detection. First, a two-stream model with a spatial path and a temporal path is constructed on the ResNeXt network to strengthen temporal feature extraction. Then, a phase spectrum-based motion saliency detection method extracts salient features of actions, which are stacked with the video frames and fed into the spatial path for feature extraction, helping the model filter out background noise. Experiments on the UCF101 and HMDB51 datasets demonstrate that extracting salient features of actions can effectively improve the model’s recognition accuracy.

(2) To extract the temporal and spatial features of actions more fully, this study proposes a post-temporal modeling action recognition method based on the self-attention mechanism. When differentiating between action categories, certain temporal features may be more important than others, and the order of temporal information may be more useful than a simple temporal average. Temporal global average pooling ignores both characteristics, so temporal features are not fully utilized. The proposed method therefore replaces the conventional temporal global average pooling layer with a self-attention mechanism, with the aim of extracting the temporal and spatial features of actions more effectively. Experimental results demonstrate the effectiveness of this method, which achieves recognition accuracy competitive with advanced models on the UCF101 and HMDB51 datasets.
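
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis implementation, which the abstract does not detail. The sketch assumes that motion saliency is computed by keeping only the Fourier phase of a grayscale frame difference, and that temporal pooling is a single-head scaled dot-product self-attention over per-frame feature vectors followed by a mean. The function names, the `sigma` parameter, and the projection matrices `w_q`, `w_k`, `w_v` are all hypothetical.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur; assumes each image side exceeds the kernel length."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    for axis in (0, 1):
        img = np.apply_along_axis(np.convolve, axis, img, kernel, mode="same")
    return img


def phase_spectrum_saliency(prev_frame, frame, sigma=2.0):
    """Saliency map in [0, 1] from the phase spectrum of a frame difference.

    Assumption: motion is approximated by the difference of two grayscale frames.
    """
    motion = frame.astype(np.float64) - prev_frame.astype(np.float64)
    if not motion.any():
        return np.zeros_like(motion)  # no motion, so nothing is salient
    spectrum = np.fft.fft2(motion)
    # Discard the amplitude spectrum and keep only the phase.
    phase_only = np.exp(1j * np.angle(spectrum))
    recon = np.fft.ifft2(phase_only)
    sal = gaussian_blur(np.abs(recon) ** 2, sigma)
    span = sal.max() - sal.min()
    return (sal - sal.min()) / span if span > 0 else np.zeros_like(sal)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_temporal_pool(feats, w_q, w_k, w_v):
    """Pool per-frame features of shape (T, D) into one vector.

    Hypothetical stand-in for temporal global average pooling: frames attend
    to each other, then the attended features are averaged over time.
    """
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # (T, T), each row sums to 1
    attended = weights @ v              # (T, D_out)
    return attended.mean(axis=0), weights
```

In this sketch, a saliency map `sal` of shape (H, W) could be stacked with an RGB frame of shape (H, W, 3) using `np.dstack([rgb_frame, sal])` to form a four-channel input for a spatial path. With all-zero query and key projections the attention weights become uniform, so the pooling reduces to plain temporal averaging; learned projections are what would let some frames count more than others. As written, the pooling is invariant to frame order, so capturing temporal order would additionally need some form of positional information on the frame features, which the abstract does not describe.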

Keywords/Search Tags:

deep learning, action recognition, motion saliency features, self-attention mechanism

Related items

1	Research On Action Recognition Algorithm Based On Deep Learning
2	Research On Human Action Recognition Based On Skeleton Feature
3	Research On Human Action Recognition Method Based On Deep Learning
4	Research On Action Recognition Based On Deep Network Learning Of Spatio-temporal Features
5	Studies On Action Recognition In Video Based On Deep Learning
6	Human Action Recognition Via Dual Spatio-temporal Network Flow And Attention Mechanism Fusion
7	Research On Human Action Recognition Based On Deep Learning
8	Research On Human Action Recognition Method Integrating Visual Attention Mechanism And Deep Learning
9	Research On Action Recognition Algorithm Based On Spatiotemporal Modeling And Its Application
10	Action Recognition Based On Two Stream Spatial-Temporal Attention Network