Research On Human Action Recognition Technology Based On Convolutional Neural Network

Posted on: 2019-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: S D Chen
GTID: 2438330545956936
Subject: Computer application technology
Abstract/Summary:
Human action recognition has broad application prospects in intelligent video surveillance, video-based content analysis and retrieval, human-computer interaction, virtual reality, and medical care. It is the task of having a computer classify human actions by extracting discriminative features from video or image sequences. A convolutional neural network (CNN) is loosely modeled on biological neural networks. Unlike traditional action recognition methods built on hand-crafted features, a CNN learns high-level features from low-level ones through hierarchical non-linear transformations, yielding a high-level abstraction of the data, and it removes the dependence of feature extraction on the specific task in action recognition. This thesis focuses on constructing several human action recognition models that not only separate the moving object from the background but are also robust to variation in actions (for example, the same action appearing differently in different scenes, different individuals performing the same action differently, and actions with occlusion). The main research work completed is as follows:
(1) We improve the K-Means clustering algorithm used to build the visual vocabulary by adding the roulette (roulette-wheel selection) algorithm. Features are extracted with the Harris-Laplace algorithm combined with the 3D-SIFT descriptor, the visual vocabulary is constructed with a bag-of-features model, and a multi-class support vector machine (SVM) serves as the classifier. Adding the roulette algorithm to the clustering makes the clusters more dispersed and improves the accuracy of action recognition.
(2) We bring batch normalization, an idea from image classification, to action recognition. We construct a network that combines the batch normalization algorithm with the GoogLeNet model: the output features of the convolutional layers are batch-normalized and then passed to the next layer. Compared with a traditional convolutional neural network, both the training algorithm and the network structure are improved, which raises the accuracy of action recognition.
(3) We use the improved convolutional neural network structure to construct a spatio-temporal two-stream network. The spatial stream captures the appearance of an action from the RGB images of the video frames, while the temporal stream captures its motion from the optical flow field between consecutive frames. Finally, we fuse the two streams so that the model accounts for both appearance and motion information, improving the accuracy of action recognition.
(4) To exploit the temporal structure of action videos, we construct a 26-layer three-dimensional convolutional neural network. The two-dimensional convolution of the traditional convolutional neural network is extended to a three-dimensional convolution, which is applied directly to the input video data or image sequence to extract spatio-temporal action information from multiple consecutive video frames.
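As an illustration of items (2) and (4), the two operations they describe can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the thesis's GoogLeNet-based or 26-layer model: the function names `batch_norm_conv` and `conv3d_valid`, the tensor shapes, and the temporal-difference kernel are all hypothetical choices made for illustration. The batch normalization shown uses training-mode batch statistics only, and the 3D convolution is single-channel with no padding and stride 1.

```python
import numpy as np


def batch_norm_conv(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for convolutional feature maps.

    x: array of shape (N, C, H, W); gamma, beta: arrays of shape (C,).
    Each channel is normalized with the mean and variance computed over
    the batch and spatial axes, then scaled by gamma and shifted by beta.
    (Hypothetical helper; running statistics for inference are omitted.)
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)


def conv3d_valid(clip, kernel):
    """Naive single-channel 3D convolution (cross-correlation, as in CNNs).

    clip: (T, H, W) stack of frames; kernel: (kt, kh, kw).
    No padding and stride 1, so the output has shape
    (T - kt + 1, H - kh + 1, W - kw + 1).
    """
    kt, kh, kw = kernel.shape
    T, H, W = clip.shape
    out = np.empty((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                window = clip[t:t + kt, i:i + kh, j:j + kw]
                out[t, i, j] = np.sum(window * kernel)
    return out


# Illustrative data only: random "feature maps" and a random "video clip".
rng = np.random.default_rng(0)
features = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 6, 6))
normed = batch_norm_conv(features, gamma=np.ones(4), beta=np.zeros(4))

clip = rng.normal(size=(16, 12, 12))
# A 2x1x1 kernel that subtracts each frame from the next one, showing
# that the temporal axis of a 3D kernel responds to change across frames.
temporal_diff = np.zeros((2, 1, 1))
temporal_diff[0, 0, 0] = -1.0
temporal_diff[1, 0, 0] = 1.0
motion = conv3d_valid(clip, temporal_diff)
```

With `gamma` set to ones and `beta` to zeros, every channel of `normed` has approximately zero mean and unit variance, which is the property that lets the next layer see inputs on a stable scale. The temporal-difference kernel makes `motion` equal to the frame-to-frame difference of `clip`, a simple case of how a 3D kernel mixes information across consecutive frames in a way a 2D kernel cannot.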
Keywords/Search Tags: Action recognition, Convolutional neural network, 3D CNN, Batch normalization