
Action Recognition Based On Deep Learning

Posted on: 2019-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q P Xia
Full Text: PDF
GTID: 2428330566977082
Subject: Instrument Science and Technology
Abstract/Summary:
With the rapid development of video acquisition technology and the Internet, video has quickly become an important carrier of information in people's daily lives, and the amount of digital video is growing exponentially. This huge volume of video creates many practical problems, such as video classification, video retrieval, and the recognition of human behavior in video. Among these, human action recognition in video has become a research hotspot and is the topic of this thesis. The main goal is to extract features from video that effectively express human behavior, to make full use of the video's short-term and long-term temporal information, and to build a network model that accurately classifies the human actions in the video. The main work of the dissertation is as follows:

(1) Because video has a temporal dimension that still images lack, this thesis proposes a segment-based fusion method for human action recognition. Convolutional neural networks usually process single-frame images, so the difficulty in handling video lies in its temporal information. The thesis first uses an optical flow algorithm to obtain short-term motion features; to make this temporal information more robust, multiple stacked optical flow frames are used. To capture the long-term temporal information, the video is divided equally into several segments, and the stacked optical flow features of each segment are used as network input and fused after the convolutional layers. Experimental analysis verifies that exploiting both short-term and long-term temporal information improves the accuracy of action recognition.

(2) Because the optical flow algorithm cannot extract useful features when the subject in the video is nearly stationary, a robust principal component analysis (RPCA) algorithm is proposed to extract sparse and low-rank features of the video for action recognition. The algorithm treats the video data as a whole and decomposes it into a sparse component and a low-rank component: the low-rank component represents the background of the video, and the sparse component represents the human behavior. Both components are obtained at the pixel level and provide two complementary, robust kinds of features (a minimal sketch of this decomposition is given after the abstract).

(3) In recent years, the rapid development of deep learning has produced a large number of excellent networks, such as AlexNet, VGG, Inception-BN, and ResNet. To build a model with convolutional neural networks, this thesis adopts a residual two-stream network. Experiments show that training different networks yields different action recognition accuracies, and among these networks the residual network has the best feature extraction capability. On this basis, a two-stream network with a sparse stream and a low-rank stream is built: the sparse and low-rank features are each passed through a residual network and then fused before the classifier. Experimental analysis verifies that the residual two-stream network improves the accuracy of action recognition.
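To make the RPCA step in part (2) concrete, the following Python sketch decomposes a matrix of stacked video frames into a low-rank background and a sparse motion component via principal component pursuit, solved with the inexact augmented Lagrange multiplier method. This is a minimal illustration, not the thesis author's code: the function name `robust_pca`, the default regularization weight, the stopping criterion, and the frame-stacking convention are all assumptions made for the example.

```python
import numpy as np

def robust_pca(M, lam=None, tol=1e-7, max_iter=500):
    """Decompose M into low-rank L and sparse S (M ~= L + S) by minimizing
    ||L||_* + lam * ||S||_1, using the inexact ALM method."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # standard choice for the sparsity weight
    norm_fro = np.linalg.norm(M, 'fro')
    spectral = np.linalg.norm(M, 2)              # largest singular value
    mu = 1.25 / spectral                         # initial penalty parameter (heuristic)
    rho = 1.5                                    # penalty growth factor
    Y = M / max(spectral, np.abs(M).max() / lam) # initial dual variable
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding of M - S + Y/mu
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        sig = np.maximum(sig - 1.0 / mu, 0.0)
        L = (U * sig) @ Vt
        # Sparse update: elementwise soft thresholding
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # Dual update and convergence check on the residual
        Z = M - L - S
        Y = Y + mu * Z
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(Z, 'fro') / norm_fro < tol:
            break
    return L, S

# Usage sketch: stack T grayscale frames of shape (h, w) as columns of M with
# shape (h*w, T). L then recovers the static background of the video and S the
# moving person; in the thesis each component could feed one stream of the
# residual two-stream network.
```

The low-rank and sparse outputs correspond to the background and human-motion features described in parts (2) and (3); how they are resized and fed into the two ResNet streams is specific to the thesis and is not reproduced here.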
Keywords/Search Tags: human action recognition, deep learning, robust PCA, residual two-stream network