Research On Action Recognition Technology Based On Video

Posted on:2021-05-26

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Zhao

Full Text:PDF

GTID:2428330605456101

Subject:Instrument Science and Technology

Abstract/Summary:

In recent years, with the rapid development of artificial intelligence, deep learning has come to play an important role in video action recognition, and using convolutional neural networks (CNNs) to extract the spatial features of images has become the mainstream approach. Two problems remain. First, complex backgrounds, lighting conditions, and other action-irrelevant visual information in video frames introduce redundancy and noise into the spatial features of an action, which lowers recognition accuracy. Second, videos of different action classes may share similar temporal context, which leads the network model to make wrong predictions.

To address these problems, this thesis designs two attention mechanisms for video action recognition: a recurrent region attention mechanism, which targets the redundancy and noise in the spatial features, and a video frame attention mechanism, which targets the interference caused by similar temporal context between actions.

Based on the spatial and temporal characteristics of video, the thesis then designs a deep spatio-temporal network model that can be trained end-to-end. The model consists of a convolutional neural network, the recurrent region attention mechanism, the video frame attention mechanism, and a long short-term memory (LSTM) network. The CNN serves as a feature extractor for the spatial features of each video frame. The recurrent region attention cell captures the action-relevant regional visual information in those spatial features; because it iterates over the frames in temporal order, the mechanism can capture the action-relevant regions in every frame of the video sequence. The video frame attention mechanism highlights the more important frames in the whole sequence, which reduces the interference caused by similar context between video sequences of different action classes. The LSTM network learns the dependencies between preceding and following video frames. The cross-entropy loss function is used to update the model parameters so that the model can better distinguish the action categories.

On this basis, the thesis makes use of both the appearance information and the motion information of actions by constructing an RGB-modality network model and an optical-flow-modality network model, and then fuses the output probabilities of the two models to improve recognition accuracy. Experimental results on two public video action recognition datasets show that the proposed recurrent region attention mechanism and video frame attention mechanism reduce the redundancy and noise in the spatial features and the interference caused by similar temporal context between actions. This verifies the effectiveness of both mechanisms and shows that the recognition accuracy of the network model is improved.
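The two aggregation steps named in the abstract, weighting frames by importance and fusing the class probabilities of the RGB and optical-flow models, can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulas, so everything below is an assumption made for illustration: frame importance is modeled as a softmax over per-frame scores (which a real model would learn), fusion is modeled as a weighted average of the two probability vectors, and the function names `frame_attention_pool`, `fuse_probabilities`, and `predict`, as well as the parameter `rgb_weight`, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def frame_attention_pool(frame_features, frame_scores):
    """Weight per-frame feature vectors by normalized scores and sum them.

    frame_features: list of T feature vectors, each a list of D floats.
    frame_scores:   list of T unnormalized importance scores (assumed to be
                    given here; in a trained model they would be learned).
    Returns (weights, pooled), where weights sum to 1 over the T frames.
    """
    if len(frame_features) != len(frame_scores):
        raise ValueError("need exactly one score per frame")
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [0.0] * dim
    for weight, feature in zip(weights, frame_features):
        for d in range(dim):
            pooled[d] += weight * feature[d]
    return weights, pooled


def fuse_probabilities(rgb_probs, flow_probs, rgb_weight=0.5):
    """Late fusion as a convex combination of two class-probability vectors.

    A weighted average is an assumption; the abstract only says that the
    output probabilities of the two modality models are fused.
    """
    if len(rgb_probs) != len(flow_probs):
        raise ValueError("both models must score the same classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    return [
        rgb_weight * r + (1.0 - rgb_weight) * f
        for r, f in zip(rgb_probs, flow_probs)
    ]


def predict(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)
```

As a usage example, `fuse_probabilities([0.6, 0.4], [0.2, 0.8])` averages the two models with equal weight, and `predict` applied to the result then selects the class that the fused distribution favors. Because the combination is convex, the fused vector remains a valid probability distribution whenever both inputs are.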

Keywords/Search Tags:

Video action recognition, Recurrent region attention mechanism, Video frame attention mechanism, CNN, LSTM

Related items

1	Attention Mechanism Based Deep Network For Human Action Recognition In Video
2	Research And Implementation Of Video Action Recognition Based On Long-Time Feature Fusion And Attention Mechanism
3	Research On Video Human Action Recognition Based On Pose Sequence
4	Human Action Recognition Method Based On Bi-LSTM And Attention Mechanism
5	Research On Video Action Recognition Based On Deep Learning
6	Research On Dynamic Video Summarization Technology Via Attention Mechanism
7	Research And Implementation Of Video Action Recognition Based On Feature Fusion And Hybrid Attention Mechanism
8	Human Action Recognition Via Dual Spatio-temporal Network Flow And Attention Mechanism Fusion
9	Research On Human Action Recognition Method Based On Deep Learning
10	Human Action Recognition Based On Spatio-temporal Network And Attention Mechanism