Research Of Human Action Recognition Method Based On Features With High Robustness

Posted on: 2017-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: D X Wu
GTID: 2308330485960390
Subject: Software engineering
Abstract/Summary:
With the continuing progress of the Internet and of computing technology, computers can increasingly acquire, transmit, process, store, and understand visual information in ways that resemble human vision. Video-based human action recognition has therefore attracted much research interest, and it has broad applications in video surveillance, video retrieval, human-computer interaction, health care, virtual reality, and other areas. The field has developed rapidly in recent years and has produced important research results and practical applications.

For feature extraction, we focus on global and local representations. For action classification, we highlight the Support Vector Machine, the Hidden Markov Model, and Conditional Random Fields. Feature selection directly affects recognition performance. Global representations encode the region of interest around a person as a whole, obtained by background subtraction or tracking, and are derived from silhouettes, edges, or optical flow. Their robustness to noise, viewpoint changes, and occlusion is not high. Local representations describe the observation as a collection of independent patches. They are generally based on interest point detection and focus more on correlations between patches, so they are less sensitive to such disturbances. However, describing the whole feature with local representations inevitably loses information; in addition, the representation is not intuitive, and localization is difficult to achieve.

Based on the BING feature for object detection, we add a time dimension and propose space-time binarized normed gradients (ST-BING) as a new feature, and we successfully apply it to action recognition and localization.
ST-BING consists of space binarized normed gradients (S-BING) and time binarized normed gradients (T-BING): T-BING captures the motion information of different actions, and S-BING records the posture information of the actors. Because we process each video frame with a multi-scale sliding window, our method weakens the influence of complex backgrounds on human action recognition, so ST-BING is scale-invariant across different actions and more robust. In addition, ST-BING is approximated by a binary representation, so feature extraction, classification, and testing can be computed with bit operations, which greatly reduces the computational complexity. We also use an action cascade SVM classification model. Compared with a traditional SVM, this model takes into account possible differences in the scale of the window that contains the action, and it improves recognition accuracy.

We carry out experiments on the UCF-Sports database with K-fold cross-validation; the accuracy of human action recognition and of localization reaches 85.4% and 44.8%, respectively. On the hardware of a current mainstream personal computer, recognition and localization run at about 0.2 sec/frame. The experimental results show that using ST-BING together with the cascade SVM for human action recognition and localization is feasible and effective.
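
The idea of binarizing spatial and temporal normed gradients and comparing them with bit operations can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the 8x8 window, the single-bit threshold of 128, the Hamming-based similarity, and all function names are assumptions made for illustration only, and the cascade SVM is not modeled.

```python
def spatial_normed_gradient(frame):
    """Per-pixel min(|gx| + |gy|, 255), with borders clamped (assumed form)."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = frame[y][min(x + 1, w - 1)] - frame[y][max(x - 1, 0)]
            gy = frame[min(y + 1, h - 1)][x] - frame[max(y - 1, 0)][x]
            out[y][x] = min(abs(gx) + abs(gy), 255)
    return out


def temporal_normed_gradient(prev_frame, frame):
    """Per-pixel min(|I_t - I_(t-1)|, 255): a simple stand-in for motion."""
    return [
        [min(abs(a - b), 255) for a, b in zip(row, prev_row)]
        for row, prev_row in zip(frame, prev_frame)
    ]


def pack_bits(grad, threshold=128):
    """Binarize a gradient window and pack it row by row into one integer."""
    bits = 0
    for row in grad:
        for value in row:
            bits = (bits << 1) | (1 if value >= threshold else 0)
    return bits


def hamming_similarity(a, b, n_bits=64):
    """Number of matching bits, computed with XOR and a population count."""
    return n_bits - bin(a ^ b).count("1")


# Hypothetical 8x8 frames: an empty frame, then one whose right half is bright.
prev_frame = [[0] * 8 for _ in range(8)]
frame = [[0] * 4 + [200] * 4 for _ in range(8)]

s_bits = pack_bits(spatial_normed_gradient(frame))               # spatial part
t_bits = pack_bits(temporal_normed_gradient(prev_frame, frame))  # temporal part

print(hamming_similarity(s_bits, s_bits))  # 64: identical windows match fully
print(hamming_similarity(s_bits, t_bits))  # 32: half of the bits agree
```

Once a window is packed into a single 64-bit integer, comparing it with a template costs one XOR and one population count, which is the kind of saving the binary approximation is meant to provide.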
Keywords/Search Tags: Human Action Recognition, Feature Extraction, Action Classification, ST-BING