
Human Action Recognition Based On Local Spatio-temporal Interesting Points

Posted on: 2016-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: M M Lu
GTID: 2348330503488401
Subject: Electronic and communication engineering
Abstract/Summary:
As one of the most active topics in computer vision, human action recognition has considerable commercial and practical value and is a major research subject in many fields. Video surveillance systems in public places and video-sharing sites, for example, need to recognize human actions automatically. This thesis studies two problems: single-person action recognition and multi-person interaction recognition.

For single-person action recognition, the Harris3D detector is used to extract interest points, and the HOG3D descriptor is computed as the appearance feature of each interest point. The k-means clustering algorithm is then applied to build the traditional bag-of-words (BOW) model. Because the traditional BOW model ignores the order of words, we introduce the local spatio-temporal distribution information of the interest points, and a bag of distribution words is obtained by clustering. The improved bag-of-words model is composed of appearance words and distribution words. A histogram of word frequencies is computed to represent each video. Finally, an SVM is trained for multi-class classification. Single-person action recognition is tested on two datasets, Weizmann and KTH. The experimental results show that the enhanced BOW algorithm effectively reduces the confusion between actions and achieves a better recognition rate.

For multi-person interaction recognition, interesting frames are extracted before recognition. An interesting frame is one in which abnormal behavior may take place. The selection of interesting frames falls into two cases. When there is no occlusion, the distance between the two persons is calculated; if the distance is less than the threshold, the frame is considered an interesting frame. When there is occlusion, we record the time T1 at which the distance between the two persons first falls below the threshold. Then we record the time T2 at which the distance first exceeds the threshold after T1. The frames between T1 and T2 are considered interesting frames. This avoids processing frames unrelated to the interaction, and it reduces both the computation time of the algorithm and the number of false interest points. The classification accuracy also increases to a certain degree. Experiments are carried out on the CASIA and UT-Interaction datasets, and the experimental results show that our algorithm is effective.
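
The interesting-frame selection described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the function name `interesting_frames`, the per-frame distance list, the boolean `occluded` flag, the inclusive treatment of T1 and T2, and the fallback when the distance never rises above the threshold again are all assumptions made for the example.

```python
from typing import List, Sequence


def interesting_frames(distances: Sequence[float],
                       threshold: float,
                       occluded: bool) -> List[int]:
    """Return indices of frames treated as 'interesting' in a two-person clip.

    distances[i] is the (hypothetical) distance between the two persons
    in frame i; how it is measured is not specified here.
    """
    if not occluded:
        # No occlusion: each frame is tested on its own.
        return [i for i, d in enumerate(distances) if d < threshold]

    # Occlusion: T1 is the first frame whose distance is below the threshold.
    t1 = next((i for i, d in enumerate(distances) if d < threshold), None)
    if t1 is None:
        return []

    # T2 is the first frame after T1 whose distance exceeds the threshold.
    t2 = next((i for i in range(t1 + 1, len(distances))
               if distances[i] > threshold), None)
    if t2 is None:
        # Assumption: if the persons never separate again, keep frames to the end.
        t2 = len(distances) - 1

    # Assumption: both endpoints T1 and T2 are included.
    return list(range(t1, t2 + 1))


if __name__ == "__main__":
    d = [5.0, 3.0, 1.5, 0.5, 0.4, 2.5, 4.0]
    print(interesting_frames(d, threshold=2.0, occluded=False))
    print(interesting_frames(d, threshold=2.0, occluded=True))
```

In the occluded case the whole T1 to T2 interval is kept, so frames inside it are not tested individually; this matches the abstract's description, in which per-frame distances are presumably unreliable while the persons overlap.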
Keywords/Search Tags: human action recognition, spatial-temporal interest points, local distribution features, bag-of-words algorithm