Spatio-temporal Feature Learning And Human Activity Analysis In Complex Scenes

Posted on:2013-02-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhu

Full Text:PDF

GTID:2218330362459209

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

This thesis studies an important problem in computer vision: human activity analysis and spatio-temporal feature learning in complex scenes. The problem involves not only traditional low-level vision tasks such as feature detection and description, but also high-level machine learning problems such as semantic analysis and understanding. The thesis focuses on human activity recognition and classification, covering sparse coding with local spatio-temporal features, unsupervised deep learning of spatio-temporal features from videos, and crowd counting and analysis based on unsupervised Bayesian trajectory clustering. Building on a broad and in-depth reading of the literature from related international conferences and transactions, the author carried out extensive research and analysis, proposed several novel algorithms and methods, and demonstrated their effectiveness through engineering applications and experiments.

Research on human action recognition. A new human action recognition method is proposed that uses sparse coding on local spatio-temporal features in place of the traditional "Bag of Words" model and obtains a global video representation through max-pooling. In addition, the author studies dictionary learning for sparse coding and combines it with transfer learning to generalize the dictionary to related tasks and improve classification accuracy.

Research on spatio-temporal feature learning. Inspired by deep learning theory, the author proposes a hierarchical distributed probabilistic model that learns invariances of spatio-temporal features in an unsupervised way. Given an input video, the model learns hierarchical feature representations bottom-up, without supervision. Experiments show that unsupervised learning without label information can achieve accuracy comparable to supervised learning in action recognition tasks.

Research on crowd analysis and counting algorithms. The proposed method works without any prior model for detecting humans. By tracking low-level visual features, it obtains a set of trajectories of moving people and clusters them in an unsupervised way to estimate the total number of people in the video. The author and colleagues in the lab collected the video data and demonstrated the effectiveness of the algorithm on this dataset.
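
As a rough illustration of the action-recognition pipeline summarized in the abstract (sparse coding of local descriptors followed by max-pooling into one video-level vector), the following is a minimal sketch and not the thesis implementation. It makes several assumptions the abstract does not state: the descriptors are random stand-ins for local spatio-temporal features, the dictionary is random rather than learned, and the sparse codes are computed with ISTA (iterative soft-thresholding), a solver chosen here for brevity. The names `ista_sparse_codes` and `max_pool` are hypothetical.

```python
import numpy as np


def ista_sparse_codes(X, D, lam=0.1, n_iter=200):
    """Approximately solve min_Z 0.5*||X - Z D||_F^2 + lam*||Z||_1 with ISTA.

    X: (n, d) local descriptors, one per row (assumed input format).
    D: (k, d) dictionary, one atom per row.
    Returns Z: (n, k) sparse codes.
    """
    # Lipschitz constant of the gradient: squared largest singular value of D.
    lipschitz = np.linalg.norm(D, ord=2) ** 2
    Z = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (Z @ D - X) @ D.T          # gradient of the quadratic term
        V = Z - grad / lipschitz          # gradient step
        # Soft-thresholding is the proximal operator of the L1 penalty.
        Z = np.sign(V) * np.maximum(np.abs(V) - lam / lipschitz, 0.0)
    return Z


def max_pool(Z):
    """Pool the codes of all descriptors into one video-level vector by
    taking the largest absolute response of each dictionary atom."""
    return np.abs(Z).max(axis=0)


# Hypothetical usage with synthetic data standing in for real descriptors.
rng = np.random.default_rng(0)
dictionary = rng.standard_normal((8, 16))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)  # unit-norm atoms
descriptors = rng.standard_normal((20, 16))

codes = ista_sparse_codes(descriptors, dictionary)
video_feature = max_pool(codes)
print(video_feature.shape)
```

The pooled vector has one entry per dictionary atom regardless of how many descriptors a video yields, which is what allows it to be passed to a standard classifier in place of a "Bag of Words" histogram.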

Keywords/Search Tags:

action recognition, sparse coding, deep learning, deep belief network, feature learning, trajectory clustering, crowd counting

Related items

1	Human Action Recognition Based On Deep Learning
2	A Crowd Counting Algorithm Based On Deep Learning
3	Multimedia Content Analysis Based On Unsupervised Feature Learning
4	Single Image Crowd Counting Based On Deep Feature Fusion
5	Crowd Counting By Deep Learning
6	The Representation And Recognition Of Trajectory Data Based On Path Signature Feature And Deep Learning Methods
7	Action Recognition Based On Deep Learning Framework
8	Research On Crowd Counting Method Based On Deep Learning
9	Crowd Counting Algorithm Based On Deep Convolutional Neural Network
10	Optimization Design For Deep Belief Network And Its Applications