The Design And Realization Of Spatial-Temporal Feature Extraction And Recognition Algorithm For Human Action Analysis

Posted on:2019-04-20

Degree:Master

Type:Thesis

Country:China

Candidate:W Zhou

Full Text:PDF

GTID:2428330545952200

Subject:Electronic and communication engineering

Abstract/Summary:

Action recognition has been an active research field in recent years. Understanding body language helps people comprehend the information others want to express, so related research has both theoretical significance and wide application value. Video-based human action representation relies on two key, complementary kinds of features: spatial and temporal. The performance of a recognition system depends largely on its ability to extract and use this information from video data. To make full use of intra-frame spatial features and inter-frame temporal relationships, this thesis proposes three action recognition algorithms. The main work is summarized as follows:

(1) The effectiveness of features extracted by different Convolutional Neural Networks (CNNs) is compared and analyzed, and an improved Long-term Recurrent Convolutional Network (improved LRCN) is proposed. LRCN is a typical action recognition network. It adopts a recurrent convolutional structure and feeds the features of fc6, AlexNet's first fully connected layer, into a Long Short-Term Memory (LSTM) network for time-series modeling, so as to extract both spatial and temporal information. Building on LRCN, the improved LRCN replaces AlexNet with the deeper ResNet-34. Experiments show that features extracted by ResNet-34 are more expressive than hand-crafted features and features from shallow CNNs (such as AlexNet); they capture spatial information effectively and improve the overall performance of the action recognition algorithms.

(2) A Residual Fourier Temporal Pyramid (ResFTP) model is proposed to model the long-term temporal structure of human actions. ResFTP first uses ResNet-34 to extract intra-frame spatial features, then uses a Fourier Temporal Pyramid (FTP) to model the long-term relationships among the extracted features, and finally uses a Support Vector Machine (SVM) for classification. The FTP effectively represents the temporal relationships in action videos. By mapping videos with different numbers of frames to feature vectors of the same dimension, it removes high-frequency noise while modeling the long-term relationships of human actions, which makes the proposed model robust. Experiments show that ResFTP achieves better recognition results: compared with the recent TRN algorithm, accuracy on the UCF101 and HMDB51 datasets improves by 1.92% and 1.83%, respectively.

(3) A Squeeze-and-Excitation Long-term Recurrent Convolutional Network (SE-LRCN) is proposed, which exploits intra-frame and inter-frame dependencies in action videos by building attention mechanisms at pixel and frame granularity. SE-LRCN applies a feature recalibration strategy to extract intra-frame and inter-frame features with SE-ResNet-34 and SE-LSTM networks, respectively. Finally, a softmax function normalizes the outputs to give the final prediction distribution. The algorithm accounts for the dependencies among feature channels and their importance when extracting spatial features, and for inter-frame dependencies and the importance of each frame during temporal modeling. Experimental results show that SE-LRCN achieves better recognition results. On the UCF101 and HMDB51 datasets, recognition accuracy increases by 11.37% and 10.33%, respectively, compared with LRCN, and by a further 1.53% and 1.57% compared with the improved LRCN.
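
The Fourier Temporal Pyramid step in item (2) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. It assumes a common FTP formulation: the frame sequence is split into 1, 2, 4, ... segments, and only the lowest-frequency Fourier magnitudes of each segment are kept. The function name `fourier_temporal_pyramid`, the parameters `levels` and `n_coeffs`, the length normalization, and the 512-dimensional random features standing in for ResNet-34 outputs are all assumptions made for illustration.

```python
import numpy as np


def fourier_temporal_pyramid(features, levels=3, n_coeffs=4):
    """Map a (n_frames, dim) feature sequence to a fixed-length vector.

    Hypothetical helper for illustration; `levels` and `n_coeffs` are
    assumed parameters, not values reported in the thesis.
    """
    features = np.asarray(features, dtype=np.float64)
    if features.ndim != 2:
        raise ValueError("features must have shape (n_frames, dim)")
    if levels < 1 or n_coeffs < 1:
        raise ValueError("levels and n_coeffs must be positive")
    n_frames, dim = features.shape
    if n_frames < 2 ** (levels - 1):
        raise ValueError("too few frames for the requested pyramid depth")

    parts = []
    for level in range(levels):
        n_segments = 2 ** level  # 1, 2, 4, ... segments per level
        # Integer boundaries that split the frames into near-equal segments.
        bounds = [(i * n_frames) // n_segments for i in range(n_segments + 1)]
        for start, end in zip(bounds[:-1], bounds[1:]):
            segment = features[start:end]
            # Magnitude spectrum along time, divided by the segment length
            # (an assumed normalization) so clips of different length compare.
            spectrum = np.abs(np.fft.rfft(segment, axis=0)) / len(segment)
            # Keep only the lowest-frequency coefficients, which discards
            # high-frequency content; zero-pad when the segment is short.
            low = np.zeros((n_coeffs, dim))
            kept = min(n_coeffs, spectrum.shape[0])
            low[:kept] = spectrum[:kept]
            parts.append(low.ravel())
    return np.concatenate(parts)


# Random arrays stand in for per-frame CNN features of two clips.
rng = np.random.default_rng(0)
short_clip = rng.normal(size=(40, 512))
long_clip = rng.normal(size=(75, 512))
vec_short = fourier_temporal_pyramid(short_clip)
vec_long = fourier_temporal_pyramid(long_clip)
print(vec_short.shape, vec_long.shape)
```

Both clips map to vectors of length dim × n_coeffs × (2^levels − 1), regardless of frame count, so they can be passed to a fixed-input classifier such as an SVM.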

Keywords/Search Tags:

Deep Learning, Deep Residual Network, Fourier Temporal Pyramid, Feature Recalibration

Related items

1	Research On Semantic Segmentation Algorithm Based On Feature Fusion And Deep Learning
2	Research On Deep Learning Techniques Based On Deep Residual Network And Its Applications In Vision
3	Research On Image Highlight Removal Based On Deep Learning
4	Depth Estimation Of Monocular Image Based On Deep Learning
5	The Study Of Deep Learning Based Human Pose Estimation
6	Face Recognition Based On Deep Residual Network
7	Application Research Of Deep Learning In Defect Detection Of Mobile Phone Data Interface
8	Research On Deep Residual Learning-Based Visual Object Tracking Algorithm
9	Research On Speaker Identification Based On Deep Learning
10	Small Target Detection In Optical Images Based On Deep Learning