Research On Human Action Recognition Based On Temporal And Spatial Characteristics And Deep Learning

Posted on:2017-08-27

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Li

GTID:2348330533950311

Subject:Electronics and Communications Engineering

Abstract/Summary:

Human action recognition in video is a very active research topic. It is usually divided into two steps: feature extraction (description) and feature classification. Feature extraction describes the human action with feature descriptors; the difficulty is extracting action features accurately against complex backgrounds. Feature classification assigns the feature descriptors to action categories; the main difficulty there is designing and choosing a suitable classifier. To address these problems, this thesis presents two approaches to human action recognition. The main research work is as follows:

(1) A human action recognition method based on spatial-temporal interest points and HOG-3D descriptors is proposed. First, dense spatial-temporal interest points are collected from the grayscale video frames. Second, HOG-3D descriptors of the dense spatial-temporal interest points are constructed from the collected grayscale video frames. Third, the HOG-3D descriptors are established from the grayscale video frames and the spatial-temporal interest points. Finally, a bag-of-words model is built with the K-means clustering algorithm, a histogram of the features of every video is constructed, and a support vector machine (SVM) is used to recognize and classify the human actions.

(2) A human action recognition method based on convolutional neural networks and spatial-temporal interest points is proposed. First, dense spatial-temporal interest points are collected from the grayscale video frames. Second, the spatial-temporal interest points from the whole video are merged into one image. Finally, this image is used as the input to a convolutional neural network, and the manually assigned labels are used for classification.

The KTH dataset is characterized by a simple shooting environment, uniform conditions, and simple human actions. The Hollywood2 dataset is more complex: its videos are usually shot in scenes close to everyday life, and the camera jitters during shooting. Our experiments show that both methods achieve a high recognition rate and strong robustness.
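The bag-of-words step of method (1) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes a codebook of cluster centres has already been obtained (for example with K-means) and shows only how one video's descriptors could be turned into a normalised visual-word histogram, which would then be passed to a classifier such as an SVM. The function names `nearest_word` and `bow_histogram`, the 2-D toy descriptors, and the three-word codebook are hypothetical and chosen for illustration only; real HOG-3D descriptors have many more dimensions.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Return the index of the codebook centre closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda k: math.dist(descriptor, codebook[k]),
    )


def bow_histogram(descriptors, codebook):
    """Build an L1-normalised visual-word histogram for one video."""
    total = len(descriptors)
    if total == 0:
        # A video with no interest points gets an all-zero histogram.
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts[k] / total for k in range(len(codebook))]


# Toy data (assumption): 2-D stand-ins for HOG-3D descriptors
# and a 3-word codebook standing in for K-means cluster centres.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
video_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]

histogram = bow_histogram(video_descriptors, codebook)
print(histogram)
```

Each video yields a histogram of the same length as the codebook, regardless of how many interest points it contains, which is what lets a fixed-input classifier compare videos of different lengths.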

Keywords/Search Tags:

Spatio-Temporal Interest Points, HOG-3D, SVM, Fusion Interest Points, Convolutional Neural Networks

Related items

1	Pig Behavior Recognition Based On Spatio-Temporal Interest Points
2	Research On Unsupervised Activity Recognition Based On Spatio-Temporal Interest Points
3	Human Action Recognition Based On Spatio-temporal Interest Points
4	Spatio-Temporal Interest Points (STIP) Based Method Of Recognizing Human Action
5	Research On Points Of Interest Recommendation Integrating Spatio-Temporal Background In Location-Based Social Networks
6	Human Action Recognition Based On Spatio-temporal Interest Points
7	Research On Spatio-Temporal Interest Point Based Human Action Recognition
8	Research On Human Behavior Recognition Based On Clouds Of Space-Time Interest Points
9	Techniques On Interest Points Detection And Image Matching
10	Content-based Image Retrieval Based On Multi-feature Fusion Of Interest Points