Human action recognition has been a central focus in computer vision, with many real-world applications such as video indexing, motion analysis and security monitoring, which can enrich daily life while safeguarding public security. However, discriminating actions with subtle variances remains a challenge. Three actions in particular, i.e. walking, jogging and running, are among the most difficult to distinguish: although they occur at different natural paces, their gestures are similar and the limited sampling rate loses much of the pace difference. Little prior work has addressed this problem. This thesis focuses on methods for effective feature description and feature quantization along dense trajectories. The main contributions are as follows.

1. Since the Short-Term Trajectory Shape (STTS) descriptor based on local feature matching describes actions inadequately, a trajectory-based descriptor of local motion trend, called Long-Term Trajectory Shape (LTTS), is proposed to capture discriminative temporal relationships between different local features; it combines a series of short-term velocities with their Motion Direction Change Histogram (MDCH). LTTS is then evaluated on the three actions walking, jogging and running. Experimental results demonstrate that LTTS obtains a classification accuracy of 93.67%, which is 2.67% above STTS, with per-class improvements of 2%, 4% and 2% for walking, jogging and running respectively.

2. Bag-of-words (BOW) uses hard assignment (HA) when creating the bag-of-features (BOF) representation, which introduces quantization randomness and forces each feature into a single word. Hence, Fisher vector (FV) encoding based on a Gaussian Mixture Model (GMM) is introduced to quantize LTTS. Furthermore, given the differences between sub-trajectories in different directions, a multi-direction composite FV (CFV) is proposed, which combines the sub-FVs computed in the horizontal and vertical directions. Experimental results show that CFV achieves better recognition performance than both FV and BOF.
On Weizmann, CFV achieves 100% accuracy, the best result to date, which is 6.2% higher than FV and 8.3% higher than BOF. On KTH, CFV achieves an overall accuracy of 95.5%, which is 0.17% higher than FV and 2.68% higher than BOF.
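The abstract does not give implementation details for the MDCH component of LTTS; the following minimal sketch shows one plausible way to compute a motion-direction-change histogram from a single dense trajectory, assuming the trajectory is available as a sequence of 2D tracked positions. The function name and bin count are illustrative, not taken from the thesis.

```python
import numpy as np

def mdch(track, n_bins=8):
    """Motion Direction Change Histogram for one trajectory (sketch).

    track: (T, 2) array of tracked (x, y) positions along the trajectory.
    Returns an L1-normalized histogram of the angle changes between
    consecutive short-term displacement (velocity) vectors.
    """
    disp = np.diff(track, axis=0)                # short-term velocities
    angles = np.arctan2(disp[:, 1], disp[:, 0])  # motion directions per step
    # direction change between consecutive steps, wrapped into (-pi, pi]
    change = np.diff(angles)
    change = (change + np.pi) % (2 * np.pi) - np.pi
    hist, _ = np.histogram(change, bins=n_bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)
```

A perfectly straight trajectory produces zero direction change at every step, so all of the histogram mass falls into the bin containing zero; curved trajectories spread mass into the signed-turn bins, which is what lets the descriptor separate steady gaits from erratic ones.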
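The encoding formulas for FV and CFV are likewise not spelled out in the abstract. The sketch below, assuming a diagonal-covariance GMM whose parameters are already fitted, computes only the mean-gradient term of the Fisher vector (the full FV also carries weight and variance gradients) and then forms the composite encoding by concatenating the sub-FVs of the horizontal and vertical sub-descriptors. All names are illustrative.

```python
import numpy as np

def fisher_vector_means(X, weights, means, sigmas):
    """Simplified Fisher vector: gradients w.r.t. GMM means only (sketch).

    X: (N, D) local descriptors; weights: (K,) mixture weights;
    means: (K, D) component means; sigmas: (K, D) diagonal std devs.
    Returns a (K*D,) power- and L2-normalized encoding.
    """
    N, D = X.shape
    diff = X[:, None, :] - means[None, :, :]            # (N, K, D)
    # posterior responsibilities gamma_{nk} under the diagonal GMM
    log_p = (-0.5 * np.sum((diff / sigmas) ** 2
                           + np.log(2 * np.pi * sigmas ** 2), axis=2)
             + np.log(weights))
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)           # (N, K)
    # accumulate the gradient w.r.t. each component mean
    fv = np.einsum('nk,nkd->kd', gamma, diff / sigmas)
    fv = (fv / (N * np.sqrt(weights)[:, None])).ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power normalization
    return fv / max(np.linalg.norm(fv), 1e-12)          # L2 normalization

def composite_fv(x_desc, y_desc, gmm_x, gmm_y):
    """Multi-direction composite FV: concatenate the sub-FVs computed on
    the horizontal and vertical sub-descriptors (each gmm_* is a
    (weights, means, sigmas) tuple fitted on that direction)."""
    return np.concatenate([fisher_vector_means(x_desc, *gmm_x),
                           fisher_vector_means(y_desc, *gmm_y)])
```

Fitting separate GMMs per direction lets each mixture specialize to the statistics of horizontal versus vertical motion, which is the intuition behind CFV outperforming a single FV over the joint descriptor.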