Action recognition in video refers to the task of identifying specific action categories from video. It has a wide range of applications in video surveillance, video retrieval, and human-computer interaction. However, most methods are still limited by complex backgrounds in videos, so the action recognition task still faces considerable difficulties and challenges. To address this problem, we combine the attention mechanism with a basic action recognition model, design the deep learning network structure, suppress the interference of background information, and improve the recognition ability of the action recognition model on videos with complex interference. The contributions of this article are as follows:

(1) This work summarizes the research status of action recognition based on traditional features, CNN models, and RNN models, and focuses on the basic deep learning models ResNet and LSTM. Finally, the attention mechanism is briefly introduced. This lays a theoretical foundation for the attention-based deep learning model developed in this work.

(2) In view of the redundant frames in videos, which reduce the reliability of action expression, this paper proposes a new temporal-attention LSTM action recognition model based on sequential verification. The model uses an SVM to discriminate the sequential relationship between video frames and learns the temporal attention of each frame by temporally pooling this sequential relationship, so as to obtain an enhanced action representation and suppress low-quality redundant frames (a minimal sketch of this weighting scheme is given below). After the enhanced features are obtained, an LSTM is used to learn the temporal dependencies between action features. The model was validated on two widely used datasets, UCF101 and HMDB51, and achieves reliable action recognition.

(3) To handle the spatial background information in a single video frame, we add a spatial attention module in the preprocessing stage of the network structure and propose an action recognition method based on a spatio-temporal attention two-stream network. This model is designed with a convolutional structure that combines average pooling and max pooling to realize spatial attention, which is used to suppress the spatial background (a minimal sketch is also given below). Experiments on the same two datasets, UCF101 and HMDB51, show that this further improves action recognition performance.
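
For the temporal attention model in contribution (2), the following is a minimal PyTorch sketch of the general idea: per-frame features are weighted by a temporal attention score and then fed to an LSTM. The SVM-based sequential verification step is not fully specified above, so it is approximated here by a hypothetical learned scoring layer (frame_scorer); this is an illustration of the weighting-plus-LSTM pipeline under that assumption, not the exact method of the paper.

```python
# Minimal sketch of temporal attention over per-frame features followed by an LSTM.
# Assumptions: per-frame features are already extracted (e.g. by a CNN backbone);
# the SVM-based sequential verification is approximated by a learned scoring layer.
import torch
import torch.nn as nn

class TemporalAttentionLSTM(nn.Module):
    def __init__(self, feat_dim=2048, hidden_dim=512, num_classes=101):
        super().__init__()
        self.frame_scorer = nn.Linear(feat_dim, 1)   # stand-in for the sequence-verification score
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, frame_feats):                  # frame_feats: (B, T, feat_dim)
        scores = self.frame_scorer(frame_feats)      # (B, T, 1) per-frame relevance
        attn = torch.softmax(scores, dim=1)          # temporal attention weights over the T frames
        weighted = frame_feats * attn                # down-weight low-quality / redundant frames
        _, (h_n, _) = self.lstm(weighted)            # model temporal dependencies of the actions
        return self.classifier(h_n[-1])              # (B, num_classes) class logits

# Example: a batch of 4 clips, each with 16 frames of 2048-d features.
feats = torch.randn(4, 16, 2048)
logits = TemporalAttentionLSTM()(feats)
print(logits.shape)  # torch.Size([4, 101])
```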
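
For the spatial attention module in contribution (3), the sketch below shows one common way to combine average pooling and max pooling along the channel dimension into a spatial attention map. The kernel size and the exact placement in the two-stream pipeline are assumptions, since the summary above does not specify them.

```python
# Minimal sketch of a spatial attention module that combines average pooling and
# max pooling over channels; the 7x7 convolution kernel is an assumed choice.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                               # x: (B, C, H, W) frame or feature map
        avg_map = x.mean(dim=1, keepdim=True)           # (B, 1, H, W) channel-wise average pooling
        max_map = x.max(dim=1, keepdim=True)[0]         # (B, 1, H, W) channel-wise max pooling
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                 # background regions get down-weighted

# Example: apply the module to input frames before the two-stream network.
frames = torch.randn(2, 3, 224, 224)
out = SpatialAttention()(frames)
print(out.shape)  # torch.Size([2, 3, 224, 224])
```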