Research On Human Action Recognition Based On Computer Vision

Posted on: 2016-07-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y H Shao
GTID: 1108330479985490
Subject: Instrument Science and Technology
Abstract/Summary:
Human action recognition (HAR) has recently become one of the most active research topics in computer vision and pattern recognition because of its wide range of applications, such as visual surveillance, video retrieval, behavior anomaly detection, and motion-sensing games. Moreover, research results on HAR indirectly inform other fields, such as gait recognition, face recognition, and expression analysis. However, the following troublesome issues often arise in human action recognition. First, a single action representation method is insufficient for recognizing human actions. Second, while most work on HAR has been visible-spectrum oriented, little research has addressed HAR based on thermal infrared imagery. Third, almost all thermal infrared imagery suffers from missing texture and blurred edges. Last but not least, many algorithms are complex, which harms the real-time performance and practicality of the system. These issues are hard to overcome effectively, but well worth the effort.

This dissertation analyzes and investigates the following key technologies for computer vision based human activity recognition: activity modeling and representation, multi-feature fusion and construction, the strategy for choosing the right fusion method, feature selection, dimensionality reduction of high-dimensional feature vectors, and classifier design.

The main contributions of this dissertation can be summarized as follows.

(1) To overcome the deficiency of a single action representation method, a new human action recognition algorithm based on multi-feature fusion and the support vector machine (SVM) is presented. The proposed algorithm (HOSOOF, Histogram of Oriented Silhouette and Oriented Optical Flow) consists of three essential cascaded modules.
First, the human silhouette is obtained by separating the salient regions from the background using background subtraction. Then, the fused multi-feature, HOSOOF, is constructed from two types of features: the histogram of the oriented silhouette (HOS) and the histogram of oriented optical flow (HOOF). Finally, the HOSOOF feature is sent to the SVM to recognize the human activity. The experimental results show that the proposed method achieves a correct recognition rate above 99.8% on the Weizmann benchmark dataset. Moreover, related analyses indicate that the proposed algorithm is effective and promising. The recognition performance of the SVM classifier is also compared with that of some other mainstream classification techniques, which further verifies the effectiveness of the proposed algorithm.

(2) Most work on HAR has been visible-spectrum oriented; this part instead uses thermal infrared imagery. To overcome the deficiency of single-scale, single-representation methods, a new human action recognition algorithm using dense-trajectory-based multi-feature fusion is presented. The method comprises the following steps: a) the dense trajectories (DT) of the input action video clip are obtained by dense sampling; b) three dense-trajectory-based descriptors are constructed, namely Histogram of Oriented Gradient (HOG), Histogram of Optical Flow (HOF), and Motion Boundary Histograms (MBH); c) the fused feature is constructed from the popular bag-of-features (BoF) representations of HOG, HOF, and MBH. Here, both dense trajectories and MBH are adopted for modeling infrared human actions for the first time. Subsequently, a k-NN classifier is employed to recognize the human actions using the computed dense trajectories-based fusion features (DTFF).
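
Step c) and the k-NN stage above can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes the HOG, HOF, and MBH descriptors have already been extracted along the dense trajectories and that a codebook per descriptor type is available (for example from k-means). The function names `bof_histogram`, `fuse_features`, and `knn_predict`, the Euclidean distance, and the default `k=3` are all hypothetical choices made for illustration.

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Quantize local descriptors (N, D) against a codebook (K, D)
    and return an L1-normalized histogram of codeword counts."""
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

def fuse_features(desc_by_type, codebooks):
    """Concatenate the per-type BoF histograms (e.g. HOG, HOF, MBH)
    into one fused vector; types are visited in sorted name order."""
    return np.concatenate([
        bof_histogram(desc_by_type[name], codebooks[name])
        for name in sorted(codebooks)
    ])

def knn_predict(train_x, train_y, query, k=3):
    """Label a fused feature vector by majority vote among its
    k nearest training vectors (Euclidean distance, an assumption)."""
    dist = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[counts.argmax()]
```

With three descriptor types and K codewords each, the fused vector has 3K entries. Simple concatenation is only one possible fusion rule; the abstract does not state which one the dissertation uses.
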
Extensive experimental results show that the proposed method achieves a best correct recognition rate above 96.67% on the benchmark thermal infrared action dataset, IADB. Moreover, related analyses indicate that the proposed algorithm is effective and promising for visible and infrared human action recognition.

(3) We studied human activity recognition with a global high-level representation, which helps reduce the complexity of HAR. We put forward a novel approach for human activity analysis based on the motion energy template (MET), a new high-level representation of video. The main idea of the MET model is that human actions can be expressed as a composition of motion energies acquired in a three-dimensional space-time volume by using a filter bank. The motion energies are computed directly from raw video sequences, so problems such as object localization and segmentation are avoided. Another important merit of the MET method is its insensitivity to gender, hair, clothing, etc. We extract MET features by using the Bhattacharyya coefficient to measure the motion energy similarity between the action template video and the test video, followed by 3D max-pooling. Using these features as input to the SVM, extensive experiments were carried out on two benchmark datasets, Weizmann and KTH. Compared with other state-of-the-art approaches, such as variation energy image (VEI), dynamic templates, and local motion pattern descriptors, the experimental results demonstrate that our MET model is competitive and promising.

(4) Finally, we put forward a novel method called simplified MET (SMET), in which two-tier octree max-pooling is used for the first time to simplify MET during the feature pooling phase. A multi-class relevance vector machine (mRVM) is then constructed to classify the SMET features derived from the filter bank.
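
The two operations named above, Bhattacharyya similarity followed by octree max-pooling, can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's method. It assumes the template and test energies are already aligned arrays of shape (T, H, W, C), with C filter-bank channels per voxel, and it reads "two-tier" as pooling over the whole volume plus its 8 octants. The function names and that reading are hypothetical.

```python
import numpy as np

def similarity_volume(template_energy, video_energy, eps=1e-12):
    """Per-voxel Bhattacharyya coefficient between two non-negative
    energy arrays of shape (T, H, W, C); returns a (T, H, W) volume."""
    # Normalize the C channel energies at each voxel into a distribution.
    p = template_energy / np.maximum(template_energy.sum(-1, keepdims=True), eps)
    q = video_energy / np.maximum(video_energy.sum(-1, keepdims=True), eps)
    # Bhattacharyya coefficient: sum over channels of sqrt(p_i * q_i).
    return np.sqrt(p * q).sum(axis=-1)

def octree_max_pool(volume, levels=2):
    """Max-pool a 3D volume over an octree. Level 0 is the whole volume,
    level 1 its 8 octants, so levels=2 yields 1 + 8 = 9 values.
    Each axis must have at least 2**(levels-1) samples."""
    feats = []
    for level in range(levels):
        n = 2 ** level  # number of splits per axis at this level
        for t_block in np.array_split(volume, n, axis=0):
            for y_block in np.array_split(t_block, n, axis=1):
                for x_block in np.array_split(y_block, n, axis=2):
                    feats.append(x_block.max())
    return np.array(feats)
```

Under this reading, each template contributes 9 pooled values per test video, and the values from all templates would be concatenated before classification. A deeper octree adds 64 values at the next level, which gives a sense of why limiting the pooling to two tiers reduces computation.
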
Experimental results on the Weizmann dataset demonstrate that the proposed method achieves a correct recognition rate above 96.67%. The amount of computation is greatly reduced, although the correct recognition rate is 1.1% lower than that of MET.

Based on the above research results, this dissertation systematically investigates the task of HAR. The proposed methods can have a positive impact on HAR system performance. Together, they enrich the current theory of HAR and support future research in this area. They also have important reference value for the development of dynamic scene understanding, movement analysis, etc.
Keywords/Search Tags: Action Recognition, Multi-features Fusion, Kernel Method, Dimension Reduction, Motion Energy