
Human Action Recognition Based On Spatial-temporal Manifold Learning

Posted on: 2015-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: H J Liu
Full Text: PDF
GTID: 2308330473951545
Subject: Electronic and communication engineering
Abstract/Summary:
Human action recognition is an important research topic in computer vision and pattern recognition, with significance for both academic research and engineering applications. Research on human action recognition comprises three main modules: extracting and describing the spatio-temporal information of human actions, extracting the intrinsic structure of those spatio-temporal characteristics, and the theory of action recognition itself. Extracting the intrinsic structure of the spatio-temporal characteristics is the key step. Because human action involves non-rigid body motion, variable appearance (different performers and changing environments), high spatio-temporal complexity, and long temporal correlation, this thesis extends traditional manifold learning methods into the spatio-temporal domain in order to extract a more discriminative intrinsic structure. The work covers the following three aspects:

(1) Among local manifold learning algorithms, inspired by Locality Preserving Projection (LPP) and its variants, we focus on maximizing the distances between frames that are similar in appearance but belong to different classes, that is, maximizing the between-class distance. When computing the pairwise weights in the local between-class neighborhood graph, we introduce temporal information to emphasize the spatio-temporal difference between those boundary points. An orthogonality constraint on the projection matrix is imposed when optimizing the objective function, so that the metric structure of the action frame space is better preserved. The result is a novel manifold embedding method, Maximum Spatio-Temporal Dissimilarity Embedding (MSTDE), which embeds each action frame into a manifold where frames from different action classes are well separated.

(2) Among global manifold learning algorithms, building on t-Distributed Stochastic Neighbor Embedding (t-SNE), we present two supervised methods, S-tSNE and ST-tSNE, which introduce class label information and temporal information so that action frames of the same class lie close to each other in the low-dimensional space. Because t-SNE-type methods provide no explicit mapping for embedding new data into the low-dimensional space, they are limited in incremental learning. To address this, we draw separately on the ideas of Locality Preserving Projection (LPP) and Locally Linear Embedding (LLE) to compute the low-dimensional embedding of new data from its local neighborhood information in the high-dimensional space.

(3) Posture silhouettes are used as features for the action frames, and the proposed manifold learning methods are applied to extract the intrinsic structure of the spatio-temporal characteristics of human action. A variant of the Hausdorff distance is introduced for frame and sequence classification. Extensive experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness and robustness of the proposed methods for human action recognition.
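The abstract does not say which variant of the Hausdorff distance is used, so the following is only an illustrative sketch of how sequence classification with such a distance might look. It assumes the commonly used "modified" Hausdorff distance (the mean, rather than the maximum, of nearest-point distances) and assumes each action sequence is a list of low-dimensional embedded frames. The function names and the toy coordinates are hypothetical and do not come from the thesis.

```python
import math

def directed_mean_min(a, b):
    """Average, over points in a, of the distance to the nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def modified_hausdorff(a, b):
    """Symmetric distance between two point sets (assumed variant:
    the larger of the two directed mean-of-minimum distances)."""
    return max(directed_mean_min(a, b), directed_mean_min(b, a))

def classify_sequence(query, labelled_sequences):
    """Return the label of the training sequence closest to the query.

    labelled_sequences is a list of (label, sequence) pairs, where each
    sequence is a list of embedded frames (tuples of floats).
    """
    best_label, _ = min(
        labelled_sequences,
        key=lambda item: modified_hausdorff(query, item[1]),
    )
    return best_label

# Toy 2-D embeddings standing in for frames projected onto a manifold.
walk = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
run = [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)]
query = [(0.1, 0.2), (1.1, 0.1)]

predicted = classify_sequence(query, [("walk", walk), ("run", run)])
print(predicted)
```

Averaging the nearest-point distances makes the measure less sensitive to a single outlying frame than the classical maximum-based Hausdorff distance, which is one plausible reason to prefer a variant for noisy silhouette data.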
Keywords/Search Tags: human action recognition, spatio-temporal manifold learning, maximum spatio-temporal dissimilarity embedding, supervised learning