
Human Action Recognition Method Based On Bi-LSTM And Attention Mechanism

Posted on: 2020-02-16  Degree: Master  Type: Thesis
Country: China  Candidate: S Zhang
GTID: 2518306353964469  Subject: Control Engineering
Abstract/Summary:
Human action recognition plays a critical role, both commercially and socially, in intelligent surveillance, human-computer interaction, video retrieval, and related applications. Improving recognition accuracy is challenging because human motion is highly complex and videos are affected by background interference, camera disturbance, and similar factors. Research on human action recognition algorithms therefore has important practical significance. After a thorough study of the related techniques, this thesis makes the following contributions.

First, the thesis constructs a feature extraction method that divides a video into several segments and applies a 3D convolutional neural network to each segment, extracting temporal and spatial features simultaneously. A conventional 3D convolutional neural network takes a fixed number of consecutive frames as input and therefore cannot fully characterize the motion of the whole video; dividing the video into segments before 3D convolutional feature extraction addresses this limitation. Tested on UCF101, the segmented 3D convolutional network reaches an accuracy of 82.7%, compared with 81.6% for the two-stream network without a pre-trained model. The experiments show that the segmented 3D convolutional network proposed in this thesis outperforms the 2D convolutional network in video feature extraction.

Second, a method based on Long Short-Term Memory (LSTM) for learning the context of motion features is discussed. Because the temporal information of an action depends on its context, two LSTM networks are used in parallel to process the video features in the forward and backward directions, so that context on both sides is taken into account. Experiments on the UCF101 dataset verify that this model performs better than a unidirectional LSTM in human action recognition tasks.

Finally, a scheme for extracting salient features based on the attention mechanism is proposed. Because different regions of an image differ in importance, the thesis uses the attention mechanism to assign a weight coefficient to each pixel of the feature map, which makes it easier for the network to learn from the regions with larger weights. The results for each video are fused with a self-attention mechanism, which assigns a weight coefficient to each result sequence according to the distribution of its own feature sequence. The system is compared with several mainstream human action recognition methods on the UCF101 test set: the two-stream network using an SVM reaches 88.0%, the 2D CNN+LSTM network reaches 88.6%, and the C3D+Bi-LSTM+Attention model of this thesis reaches 90.7%. The results show that the proposed system performs better.
Keywords/Search Tags: Human action recognition, 3D convolutional neural network, LSTM, attention mechanism
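
Two of the steps described in the abstract, dividing a video into segments and fusing the per-segment results with self-attention weights, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the choice of one centered clip per segment, the 16-frame clip length used in the usage note, and the scoring rule (each segment's agreement with the mean prediction) are all assumptions made for illustration, and the 3D CNN and Bi-LSTM stages are left out entirely.

```python
import math


def split_into_segments(num_frames, num_segments, clip_len):
    """Return one (start, end) frame range per segment.

    Assumption: each segment contributes a single fixed-length clip centered
    in the segment, so that a fixed-input 3D CNN could see the whole video.
    """
    if num_segments <= 0 or clip_len <= 0:
        raise ValueError("num_segments and clip_len must be positive")
    if num_frames < clip_len:
        raise ValueError("video is shorter than one clip")
    seg_len = num_frames / num_segments
    clips = []
    for k in range(num_segments):
        seg_start = int(k * seg_len)
        seg_end = int((k + 1) * seg_len)
        center = (seg_start + seg_end) // 2
        # Clamp the clip so it stays inside the video.
        start = max(0, min(center - clip_len // 2, num_frames - clip_len))
        clips.append((start, start + clip_len))
    return clips


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fuse(segment_scores):
    """Fuse per-segment class scores with weights derived from the scores.

    Assumption: a segment's relevance is its dot product with the mean score
    vector; the weights are the softmax of these relevances.
    """
    n = len(segment_scores)
    num_classes = len(segment_scores[0])
    mean = [sum(s[c] for s in segment_scores) / n for c in range(num_classes)]
    relevance = [sum(a * b for a, b in zip(s, mean)) for s in segment_scores]
    weights = softmax(relevance)
    fused = [
        sum(w * s[c] for w, s in zip(weights, segment_scores))
        for c in range(num_classes)
    ]
    return fused, weights
```

For example, `split_into_segments(64, 4, 16)` yields four clip ranges covering a 64-frame video, and `self_attention_fuse` applied to the per-clip class scores gives more weight to segments that agree with the overall prediction than to outliers.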