Human Action Recognition Based On Spatio-temporal Feature

Posted on:2018-01-20

Degree:Master

Type:Thesis

Country:China

Candidate:X J Fan

GTID:2348330512987085

Subject:Computer application technology

Abstract/Summary:

Video-based human action recognition is an active research topic in computer vision, with broad applications in human-computer interaction, video surveillance, and virtual reality. Achieving high recognition accuracy under complex backgrounds and external factors (such as lighting, occlusion, and movement) remains the main focus of researchers and an urgent open problem in behavior recognition. Methods based on spatio-temporal features have proved effective for these problems. This thesis studies spatio-temporal feature extraction and proposes improvement strategies. The main work and contributions are as follows:

1) Realistic human action recognition based on a mixed spatio-temporal feature descriptor. Recognizing realistic human actions from local spatio-temporal features is an important branch of human behavior recognition. Its key problems are how to obtain effective interest points and how to describe and characterize the motion features around them. To this end, multi-scale Dollar's spatio-temporal interest points are first extracted from the input video, and video blocks describing the local motion regions are then extracted around those interest points. A novel multidirectional projection optical flow histogram (DPHOF) descriptor is proposed and used together with the orientation histogram of 3D gradients (3DHOG) to represent each video volume. A SOM generates the global video descriptor, and a KNN classifier performs the final classification. Experimental results on the UCF-YT and KTH datasets show that the proposed method achieves better recognition results than state-of-the-art methods.

2) Realistic human action recognition based on a Dropout convolutional neural network. Convolutional neural networks (CNNs) have become a research hotspot in many scientific fields. As a deep model, a CNN can be applied directly to the raw input, without manually designed feature descriptors. This thesis makes the following improvements to the 3D convolutional neural network: Gabor wavelet kernels are used to initialize the convolution layers, simulating the response of the human visual system to visual stimuli; and during training, Dropout is applied to randomly remove some neurons, improving the generalization ability of the network and preventing over-fitting. The method is validated on the KTH and UCF-YT datasets and achieves good recognition results.
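The two modifications to the 3D CNN named in contribution 2 can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the abstract gives no kernel sizes, orientations, or dropout rate, so every function name and parameter value below (gabor_kernel_2d, gabor_bank_3d, dropout, a 7x7x3 kernel, 4 orientations) is an assumption chosen for illustration. The sketch builds a small bank of Gabor kernels, repeats each along the temporal axis to obtain 3D initial weights, and applies standard inverted dropout to a layer's activations.

```python
import numpy as np


def gabor_kernel_2d(size=7, sigma=2.0, theta=0.0, wavelength=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor function sampled on a size x size grid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinate frame by theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier


def gabor_bank_3d(n_orientations=4, size=7, depth=3):
    """Initial 3D conv weights: one 2D Gabor per orientation, repeated over time.

    Returns an array of shape (n_orientations, depth, size, size).
    Repeating along the temporal axis is an assumption of this sketch.
    """
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    bank = np.stack(
        [np.repeat(gabor_kernel_2d(size, theta=t)[None], depth, axis=0) for t in thetas]
    )
    # Zero-mean, unit-norm kernels keep the initial responses on a common scale.
    bank -= bank.mean(axis=(1, 2, 3), keepdims=True)
    norms = np.linalg.norm(bank.reshape(n_orientations, -1), axis=1)
    bank /= norms[:, None, None, None]
    return bank


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training."""
    if not training or rate == 0.0:
        return activations
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep
    # Rescale the kept units so the expected activation is unchanged.
    return activations * mask / keep


if __name__ == "__main__":
    weights = gabor_bank_3d()
    rng = np.random.default_rng(0)
    hidden = dropout(np.ones((2, 8)), rate=0.5, rng=rng)
    print(weights.shape, hidden.shape)
```

In a real network the bank would be copied into the first convolution layer's weights before training, and dropout would be switched off (training=False) at test time.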

Keywords/Search Tags:

Spatio-temporal interest points (STIPs), Orientation histograms of 3D gradient orientations (3DHOG), Optical flow histogram (HOF), Self-Organizing feature Map (SOM), Convolutional Neural Network (CNN), Gabor Wavelet kernel

Related items

1	Violent Event Detection Algorithm Based On The Spatio-temporal Features Of The Video
2	The Pedestrian Detection Based On Spatio-Temporal Interest Point And Histograms Of Oriented Gradients
3	Research On Surveillance Video Synopsis Based On Spatio-Temporal Slice
4	Actions Recognition Based On Convolution Neural Network And Composite Feature
5	Video Action Recognition Based On 2D Convolution Network Under Spatio-Temporal Feature Enhancement Mechanism
6	Research On Convolution Neural Network Behavior Recognition Based On Optical Flow Characteristic
7	Research On Video Behavior Classification Technology Based On Spatio-Temporal Features
8	Person Re-identification Algorithm Combining Spatio-temporal Apparent Feature Fusion With Feature Matching
9	Lighting Uneven Conditions Of Optical Flow Measurement
10	Abnormal Detection In Crowd Scenes Based On The Histograms Of Oriented Optical Flow And Sparse Representation