Research On Human Action Recognition Based On Computer Vision

Posted on: 2019-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2428330548986600
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of sensor technology and the Internet, and the maturing of machine learning theory, human action recognition in video is attracting increasing attention from researchers. The technology has high academic and commercial value and can be applied in scenarios such as human-machine interaction, intelligent monitoring, motion analysis, and video retrieval. The performance of traditional human action recognition methods depends largely on manually extracted features, which are computationally complex to obtain and not sufficiently robust. In this thesis, deep neural networks are used to simulate the way the biological brain processes visual information in order to extract human action features from video. This approach adapts to action recognition in complex environments, simplifies the traditional manual feature extraction process, and improves recognition accuracy.

First, the thesis constructs a convolutional neural network model extended to three dimensions (a 3D CNN). Because the regions of the body that change significantly during a movement strongly influence action recognition, the three-frame difference method is used to compute the regions where the human body moves, yielding an inter-frame difference channel. The original grayscale video and the inter-frame difference channel together form a dual-channel input from which the 3D CNN extracts features. Experiments on the KTH dataset show that the dual-channel 3D CNN achieves 92.5% recognition accuracy, reducing the difficulty of feature extraction and improving the robustness of the algorithm. In addition, the effect of several parameters on the recognition performance of the 3D CNN was studied experimentally. The results show that a 3×3×3 convolution kernel gives slightly lower recognition performance than a 5×5×5 kernel but is more efficient; that the log-likelihood cost function converges faster than the ordinary mean squared error cost function; and that dropout can, to a certain extent, avoid overfitting on small datasets.

Then, to extract sufficient features from datasets with many videos, rich content, and complex backgrounds, such as UCF-101, the thesis applies transfer learning. A CNN is pre-trained on the ImageNet dataset; the weights of the trained classification network are then transferred to the UCF-101 dataset and fine-tuned to extract features. Because the same action can take different amounts of time to perform in practice, the thesis does not use traditional frame alignment but instead uses an LSTM network to recognize variable-length human action sequences. The model combining the pre-trained CNN with the LSTM network achieves 88.7% accuracy on UCF-101, which validates its effectiveness for human action recognition in video.
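
The three-frame difference step and the dual-channel input described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the thesis's implementation. The function names `three_frame_difference` and `dual_channel_clip`, the threshold value of 25, and the choice to combine the two thresholded differences with a logical AND are all assumptions; the AND rule is one common formulation of three-frame differencing, and the abstract does not say which variant was used.

```python
def three_frame_difference(prev, curr, nxt, threshold=25):
    """Return a binary motion mask for the middle of three grayscale frames.

    Frames are lists of rows of pixel intensities. A pixel is marked as
    moving (1) only if it changed by more than `threshold` both from
    prev to curr and from curr to nxt (assumed AND variant).
    """
    return [
        [
            1 if abs(c - p) > threshold and abs(n - c) > threshold else 0
            for p, c, n in zip(row_p, row_c, row_n)
        ]
        for row_p, row_c, row_n in zip(prev, curr, nxt)
    ]


def dual_channel_clip(frames, threshold=25):
    """Pair each interior frame with its motion mask.

    Returns a list of (gray_frame, motion_mask) tuples, one for every frame
    that has both a predecessor and a successor, i.e. len(frames) - 2 entries.
    """
    return [
        (
            frames[t],
            three_frame_difference(frames[t - 1], frames[t], frames[t + 1], threshold),
        )
        for t in range(1, len(frames) - 1)
    ]


# Toy 1x4 "video": a bright pixel moves one step to the right per frame.
frames = [
    [[200, 0, 0, 0]],
    [[0, 200, 0, 0]],
    [[0, 0, 200, 0]],
    [[0, 0, 0, 200]],
]
clip = dual_channel_clip(frames)
for gray, mask in clip:
    print(gray, mask)
```

In this toy clip the mask for each interior frame highlights the current position of the moving pixel. In a real pipeline the two channels would typically be stacked into a single array per clip before being fed to a 3D convolutional network; that step is omitted here.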
Keywords/Search Tags: computer vision, human action recognition, deep learning, convolutional neural network, transfer learning