Research For Action Recognition Based On Spatial-Temporal Stream Convolution Neural Networks

Posted on:2018-08-08

Degree:Master

Type:Thesis

Country:China

Candidate:D W Zhao

GTID:2428330596453019

Subject:Information and Communication Engineering

Abstract/Summary:

Human pose estimation and behavior recognition in video are widely used in intelligent surveillance, medical diagnosis, human-computer interaction, and motion analysis, which has made them an active research topic in computer vision. However, the high degree of freedom of human posture and the complexity of behavior data sets make the task difficult, and the difficulty is most pronounced for actions that differ only subtly. Convolutional neural networks simplify the feature extraction phase of image recognition by avoiding hand-designed features, and they have become a focus of research in many fields. Building on the ability of convolutional neural networks to learn features automatically, this thesis improves the classification of subtle movements on complex data sets by refining the input of the network:

(1) In the pose estimation stage of behavior recognition, the NBest algorithm is used to generate N candidate poses for each video frame. The candidates are then decomposed by limb to obtain a larger, limb-based candidate set, and the pose is finally reconstructed by recombining limbs from top to bottom. When locating the wrist and elbow, localization accuracy is improved by introducing a virtual link to the same part in the next frame. Experiments show that pose reconstruction based on limb decomposition performs better in evaluation, with a marked improvement in wrist and elbow localization.

(2) In the feature extraction stage of behavior recognition, the feature-learning ability of deep convolutional neural networks is exploited, and training starts from a model pre-trained on ILSVRC-2012. To fully extract both the static information and the motion information of an action, a network structure in which a spatial stream and a temporal stream learn in parallel is proposed, and action classification is then completed by feature fusion. Experiments show that feature fusion based on the spatial-temporal network considerably improves recognition performance.

(3) To address the poor recognition of subtle actions on complex data sets, a multi-part segmentation strategy based on the estimated pose is proposed. Since only the upper body is visible in most of the selected data sets, only the arm and the upper body are segmented. To extract the optical flow between successive frames of each segment, the segmented images are normalized to a fixed size, and the fixed-size RGB images and optical flow images are then fed to the spatial and temporal streams of the convolutional neural network, respectively. Finally, the extracted features are fused and classified with an SVM. The experimental results show that the multi-part segmentation scheme classifies better than a traditional convolutional neural network on JHMDB and MPII Cooking.
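The abstract does not specify how the features from the two streams are fused. As a minimal sketch, assuming fusion means concatenating L2-normalized feature vectors from the spatial (RGB) and temporal (optical flow) streams for each body part, the combined descriptor could be built as follows before being passed to an SVM. The function names `l2_normalize`, `fuse_features`, and `fuse_parts`, and the part names used as dictionary keys, are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a feature vector to unit L2 norm (eps guards against zero vectors)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def fuse_features(spatial: Sequence[float], temporal: Sequence[float]) -> List[float]:
    """Assumed fusion rule: normalize each stream, then concatenate."""
    return l2_normalize(spatial) + l2_normalize(temporal)


def fuse_parts(
    part_features: Dict[str, Tuple[Sequence[float], Sequence[float]]]
) -> List[float]:
    """Concatenate the fused (spatial, temporal) features of every body part.

    Parts are visited in sorted name order so the descriptor layout is stable.
    """
    fused: List[float] = []
    for name in sorted(part_features):
        spatial, temporal = part_features[name]
        fused.extend(fuse_features(spatial, temporal))
    return fused


# Toy example with two hypothetical parts and 2-dimensional stream features.
descriptor = fuse_parts({
    "arm": ([3.0, 4.0], [0.0, 2.0]),
    "upper_body": ([1.0, 0.0], [0.0, 5.0]),
})
```

Normalizing each stream before concatenation keeps one stream from dominating the other purely because of feature scale; the resulting `descriptor` is the kind of fixed-length vector a linear SVM could take as input.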

Keywords/Search Tags:

spatio-temporal stream convolution neural network, behavior recognition, limb decomposition, posture segmentation

Related items

1	Recognition Of Upper Limb Posture Based On Deep Convolution Neural Networks
2	Research On Human Behavior Recognition Method Based On Graph Convolutional Networks
3	Behavior Recognition Based On Deep Neural Network
4	Research On Video Behavior Recognition Method Based On Deep Learning
5	Research On Video Behavior Recognition Technology Based On Spatio-Temporal Modeling
6	Research On Spatio-temporal Graph Convolution Behavior Recognition Algorithm For Intelligent Surveillance Scene
7	Research On Human Action Recognition Algorithm Based On Spatio-temporal Graph Convolutional Network
8	Human Action Recognition Based On Spatio-temporal Feature
9	Human Behavior Recognition Method Based On Double-stream Deep Convolution Neural Networks
10	Research On Human Action Recognition Based On Spatio-temporal Graph Convolutional Neural Network