Font Size: a A A

Human Action Recognition Based On Deep Learning

Posted on:2017-03-23Degree:MasterType:Thesis
Country:ChinaCandidate:C GengFull Text:PDF
GTID:2308330491450321Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
Human action recognition has become a hot spot in the field of image processing, computer vision, machine learning, etc. for its broad application prospect in reality. Deep learning, inspired by the human visual mechanism, has obtained a breakthrough progress and brought a new research direction of human action recognition at the same time. Based on a series of algorithms, deep learning used to acquire the high-level abstractions from data without any supervision, by using multilayered nonlinear transformations. Different from traditional recognition methods to extract features by hand, deep learning automatically learns high-level features from low-level features which never rely too much on the task itself and decrease the computer calculation. This paper focuses on human action recognition under complex scene and feature extraction in the video both in space and time dimension so that overcome the difficulty of recognition caused by the environment difference and time variation.Based on the study of typical deep model like convolutional neural networks and deep belief networks, this paper proposed several novel algorithms. These innovations are briefly summarized as follows:(1) Research on action recognition of RGB images in complex scene. A new convolutional neural network model is proposed for extracting features from 2D pictures and then classifying with a softmax regression. It has a unique advantage in image processing in complex scene as its invariance of Specific posture, illumination, environment and disorderly change. Instead of The traditional backward propagation algorithm, initializing filters of the model with a trained CAE stack yields superior performance on a KTH database.(2) Research on spatio-temporal feature learning. A 3D convolutional neural network model is proposed, which exacting information of consecutive video frames both in time and space dimension. At the same time, in order to accelerate the operation speed of the network, in the premise of high resolution of the original input stream, a stream of low resolution input is added that forms a new two stream 3D convolutional neural network framework. Experiments show that this method achieved similar results traditional algorithm without any priori information.(3) Since the previous two study points all mark on extracting features from RGB images and also considering the spatial and temporal information, this paper presents a recognition model based on RGB-D video data. This paper utilizes depth images to extract the feature vector as input of a pyramid multilayer deep belief network, supplemented by the improved restricted Boltzmann machine learning algorithm, this model is effective and robust and greatly reducing computation cost comparing with the previous algorithm.
Keywords/Search Tags:Action Recognition, Feature Representation, Deep Learning, Convolutional Neural Network, Deep Belief Network
PDF Full Text Request
Related items