
Research On Algorithms Of Human Action Recognition Based On Videos

Posted on: 2017-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: C H Huang
Full Text: PDF
GTID: 2348330485484599
Subject: Signal and Information Processing
Abstract/Summary:
Human action recognition has become one of the most active research topics in computer vision and is widely used in areas such as video surveillance, human-computer interaction, and action analysis. This thesis studies two algorithms for human action recognition based on the covariance matrix of spatio-temporal features. One is based on an improved Log-Euclidean bag of words, and the other is based on Stein kernel sparse coding. The main content of this thesis is as follows:

1. For action descriptors, we discuss using the feature covariance matrix to represent human action; it can fuse multiple features and gives a low-dimensional representation of an action. For action feature extraction, we comprehensively analyze the representational ability of gradient and optical flow features, then recombine them into a feature that explicitly captures edge and motion dynamics. We also introduce a robust shape feature based on spatio-temporal silhouette information.

2. We propose the improved Log-Euclidean bag of words for action recognition. We extract the covariance matrices of spatio-temporal cuboids of video segments. To reuse the tools available in Euclidean space, the key idea is to embed the covariance matrices into the Log-Euclidean vector space. We extend the improved bag-of-words (BoW) model to action recognition. First, we learn a codebook with spectral clustering, which is simple to implement and outperforms traditional clustering algorithms such as k-means. Then we study locality-constrained linear coding in place of soft/hard assignment coding or sparse coding; it offers good reconstruction, local smooth sparsity, and a fast approximation. Pooling with the spatial pyramid model is used to obtain a compact feature representation. Finally, we classify human actions with a nonlinear support vector machine.

3. We propose an algorithm based on Stein kernel sparse coding. To reduce computational complexity and data redundancy, we partition the video into successive overlapping segments, which also increases the diversity of actions, and compute the feature covariance matrix. To improve the representational ability of the feature covariance matrix, we introduce a dimension reduction approach for covariance matrices that does not need to change the space or convert the data type, and that minimizes intra-class distances while simultaneously maximizing inter-class distances. Combining the feature covariance matrix with Stein kernel sparse coding gives a recognition method that is simple and has high recognition accuracy. Human actions can be classified either directly from the residual error or with Euclidean classifiers. We also discuss the nearest-neighbor classifier with the affine-invariant Riemannian metric (AIRM) or the Stein divergence. Finally, we build a human action dataset under surveillance conditions and use it to assess the practicality of the algorithm.
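
As an illustration of the covariance descriptor and the Stein-divergence nearest-neighbor classifier mentioned above, the following is a minimal sketch and not the thesis implementation. It uses the standard definition of the Stein divergence, S(X, Y) = log det((X + Y) / 2) - (1/2) log det(XY), and the kernel form exp(-beta * S(X, Y)). The function names, the regularizer eps, the bandwidth beta, and the labels in the usage note are all assumptions made for this example. It is written in plain Python for self-containment; a numerical library would normally be used in practice.

```python
import math


def covariance_descriptor(features, eps=1e-6):
    """Covariance matrix (d x d list of lists) of n feature vectors of length d.

    eps is added to the diagonal so the result is strictly positive
    definite (an assumption of this sketch, not a detail from the thesis).
    """
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        centered = [f[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += centered[i] * centered[j] / (n - 1)
    for i in range(d):
        cov[i][i] += eps
    return cov


def log_det_spd(mat):
    """log(det(mat)) for a symmetric positive definite matrix.

    Uses the Cholesky factor L (mat = L L^T): log det = 2 * sum(log L_ii).
    """
    d = len(mat)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = mat[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(d))


def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * (log det X + log det Y)."""
    d = len(x)
    mid = [[(x[i][j] + y[i][j]) / 2.0 for j in range(d)] for i in range(d)]
    return log_det_spd(mid) - 0.5 * (log_det_spd(x) + log_det_spd(y))


def stein_kernel(x, y, beta=1.0):
    """Kernel value exp(-beta * S(X, Y)); beta is a hypothetical bandwidth."""
    return math.exp(-beta * stein_divergence(x, y))


def nearest_neighbor_label(query, gallery):
    """Label of the gallery covariance closest to query in Stein divergence.

    gallery is a list of (covariance_matrix, label) pairs.
    """
    return min(gallery, key=lambda item: stein_divergence(query, item[0]))[1]
```

For example, with a gallery of two labelled covariance matrices, nearest_neighbor_label(query, [(cov_a, "walk"), (cov_b, "run")]) returns the label of whichever matrix has the smaller Stein divergence to the query. Unlike the AIRM, the Stein divergence needs only determinants, so no matrix logarithm or eigendecomposition is required.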
Keywords/Search Tags: Action recognition, Covariance matrix, Improved Log-Euclidean bag of words, Dimension reduction of covariance matrix, Stein kernel sparse coding