
Human Interaction Recognition Research And System Design Using Spatial-temporal Pyramid Joint Features

Posted on: 2016-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2308330473460905
Subject: Electronic and communication engineering
Abstract/Summary:
Human action recognition is an important area of computer vision research and has been widely used in systems such as intelligent stores and intelligent monitoring. However, recognizing the interaction between two persons remains more difficult than recognizing single-person actions. In this thesis, we mainly study feature extraction and action modeling for interaction recognition, and design a system that implements the algorithm. The main contributions are as follows.

First, following the idea of multi-feature fusion, we propose a feature extraction and processing algorithm based on a spatial-temporal pyramid. The algorithm fuses two features: a trajectory feature that captures global changes and a spatial-temporal feature that captures regional movements. On the one hand, trajectories are obtained with the Kanade-Lucas-Tomasi (KLT) algorithm. A Fourier descriptor is employed to describe the outline, which is combined with the magnitude and direction to form the low-level feature. The bag-of-words (BOW) model is then applied to the low-level feature to form the global descriptor. On the other hand, we extract Histogram of Oriented Gradients (HOG) and Histogram of Weighted Optical Flow features to describe regional movements. These two features are processed with sparse coding and max pooling over the spatial-temporal pyramid. Finally, the sparse global and regional features are fused by weighted concatenation. Experiments show that the resulting descriptor has low redundancy and high discrimination, which provides a good foundation for action modeling and classification.

Second, we implement a recognition algorithm based on the Latent Dynamic Conditional Random Field (LDCRF) and design the interaction recognition system. Experimental results on the UT, BT and Hollywood datasets show that the proposed algorithm achieves good performance. The system contains three modules: object detection and tracking, feature extraction, and action modeling and analysis. It analyzes the input video in real time and outputs the results of feature extraction and human interaction recognition.
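
The pooling and fusion steps described above can be illustrated with a small Python sketch. It is not the thesis implementation: the function names, the cubic grid levels `(1, 2)`, pooling on absolute values of the sparse codes, and the single fusion weight `w` are all assumptions made for illustration. Each local feature is assumed to arrive with a normalized position `(x, y, t)` in `[0, 1]` and an already computed sparse code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # normalized (x, y, t), each in [0, 1]
Code = Sequence[float]              # sparse code of one local feature


def pyramid_max_pool(points: Sequence[Point],
                     codes: Sequence[Code],
                     levels: Sequence[int] = (1, 2)) -> List[float]:
    """Max-pool sparse codes inside every cell of a spatial-temporal pyramid.

    Level n splits the video volume into n x n x n cells (hypothetical layout).
    The pooled vectors of all cells on all levels are concatenated.
    """
    k = len(codes[0])
    pooled: List[float] = []
    for n in levels:
        cells = [[0.0] * k for _ in range(n ** 3)]
        for (x, y, t), code in zip(points, codes):
            # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
            ix = min(int(x * n), n - 1)
            iy = min(int(y * n), n - 1)
            it = min(int(t * n), n - 1)
            cell = cells[(it * n + iy) * n + ix]
            for j, value in enumerate(code):
                # Assumption: pool on magnitudes, so negative codes count too.
                cell[j] = max(cell[j], abs(value))
        for cell in cells:
            pooled.extend(cell)
    return pooled


def weighted_fusion(global_feat: Sequence[float],
                    regional_feat: Sequence[float],
                    w: float = 0.5) -> List[float]:
    """Concatenate two descriptors, scaling them by w and (1 - w)."""
    return [w * v for v in global_feat] + [(1.0 - w) * v for v in regional_feat]


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)]
    sparse_codes = [[0.0, 2.0], [1.0, -3.0]]
    regional = pyramid_max_pool(pts, sparse_codes)
    fused = weighted_fusion([1.0, 2.0], regional, w=0.25)
    print(len(regional), len(fused))
```

With two pyramid levels and a code length of 2, the pooled regional descriptor has (1 + 8) x 2 = 18 entries. In a real pipeline the weight `w` would be tuned on validation data rather than fixed.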
Keywords/Search Tags: Trajectory Feature, Spatial-Temporal Feature, Sparse Coding, Spatial-Temporal Pyramid, Latent Dynamic Conditional Random Field