
Research On Action Recognition Method Based On Multi-feature Fusion

Posted on: 2019-10-03    Degree: Master    Type: Thesis
Country: China    Candidate: C Feng    Full Text: PDF
GTID: 2428330590465727    Subject: Computer Science and Technology
Abstract/Summary:
In the past 40 years, computer technologies such as computing, networking, and storage have developed rapidly and have profoundly affected and transformed human life. Researchers in computer science once focused mainly on computational theory, but the study of computer application technology has gradually gained prominence. Human action recognition, an important and challenging topic in computer vision, has attracted the attention of many researchers. Although human action recognition has a wide range of applications, many technical problems remain unsolved, such as large variations in human posture, changes in viewpoint, and cluttered backgrounds.

A popular approach to action recognition is to classify videos based on their local features. Such methods first extract local features from the video, quantize and encode them, and then use a machine learning algorithm to train a classification model. Finally, new samples are classified by the model. In recent years, deep learning has developed rapidly in computer vision, and researchers have proposed action recognition methods based on it. The main idea of these methods is to use a convolutional network to learn the features of video frames automatically, then encode those features and classify them. This thesis studies and extends the two-stream network model based on deep learning and the bag of visual words (BOVW) model based on local features, and proposes an effective action recognition method. The main work of this thesis is reflected in the following two aspects.

1. An action recognition method based on hashing features and the two-stream network model is proposed. The classical two-stream network model does not focus on key frames in the video, but extracts features from the video as a whole. As a result, the model handles within-class differences poorly, which lowers the recognition rate. In this thesis, hashing windows of different sizes are proposed to adaptively select key-frame sequences from video segments of variable length. First, CNN frame features are extracted with a pre-trained classical network model; the differences between CNN frame features are then compared, and the comparison results are mapped to a video feature representation. Next, the hashing features and the two-stream features are concatenated and normalized into a single feature. Experimental results show that the binary hash feature is effective in improving recognition accuracy.

2. An action recognition method combining the bag of visual words model and the C3D network model is proposed. The classical bag of visual words model can express the local features of human actions efficiently, while the C3D model can extract features from video on both spatial and temporal scales. Deep network features tend to attend to the facial information of the human body and ignore information from other body parts. Therefore, this thesis proposes a feature fusion method that combines the two models for action recognition. Experimental comparisons on several data sets show the effectiveness of this method.
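The first contribution above (comparing CNN frame features, mapping the comparisons to a video-level representation, then concatenating with two-stream features and normalizing) can be illustrated with a minimal sketch. The abstract does not give the exact hashing scheme, so everything concrete here is an assumption made for illustration: the function names `hash_feature` and `fuse`, the window sizes, the rule that a bit is 1 when a later frame's feature exceeds an earlier one, the averaging over time, and the use of L2 normalization.

```python
import numpy as np


def hash_feature(frame_feats, window_sizes=(1, 2, 4)):
    """Map per-frame CNN features of shape (T, D) to a fixed-length vector.

    Assumed scheme (not taken from the thesis): for each window size w,
    frame t is compared with frame t + w. Each comparison yields a D-bit
    binary code, and the codes are averaged over all valid t.
    """
    frame_feats = np.asarray(frame_feats, dtype=np.float64)
    num_frames, dim = frame_feats.shape
    parts = []
    for w in window_sizes:
        if w >= num_frames:
            # Video too short for this window: contribute a zero block.
            parts.append(np.zeros(dim))
            continue
        # Boolean array of shape (T - w, D): True where the feature increased.
        bits = frame_feats[w:] > frame_feats[:-w]
        parts.append(bits.mean(axis=0))
    return np.concatenate(parts)


def fuse(hash_vec, two_stream_vec, eps=1e-12):
    """Concatenate the two feature vectors and L2-normalize the result."""
    fused = np.concatenate([hash_vec, two_stream_vec])
    return fused / (np.linalg.norm(fused) + eps)


# Toy example: 3 frames with 2-dimensional features, plus a made-up
# 3-dimensional two-stream feature for the same video.
frames = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])
h = hash_feature(frames, window_sizes=(1, 2))
video_feature = fuse(h, np.array([0.5, 0.5, 0.5]))
```

The fused vector has a fixed length regardless of the number of frames, so under these assumptions it could be passed to any standard classifier.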
Keywords/Search Tags:action recognition, two-stream network, C3D model, feature fusion