Research on Action Recognition Modeling of Video Sequences

Posted on: 2020-04-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q J Xu
GTID: 1368330611955432
Subject: Information and Communication Engineering
Abstract/Summary:
Activity recognition in video sequences is an active topic in computer vision research because of its wide range of applications. Recognition in simple scenes with a fixed camera has made great progress, but complex scenes still pose many challenges. Finding effective visual feature representations in noisy real-world scenes, together with efficient and robust recognition algorithms that meet real-time processing requirements, will remain a research goal for a long time. To improve the performance of activity recognition, this dissertation makes the following contributions:

1) Activity recognition with a probabilistic latent semantic analysis model. By mining co-occurrence patterns to represent the activities in videos, the probabilistic latent semantic analysis model makes the features more discriminative. To further improve recognition performance, the effect of combining different encoding methods with different normalization methods is explored. Local soft assignment combined with power normalization improves recognition performance when sparse spatio-temporal features are used. Preprocessing the raw features with principal component analysis is then investigated: it reduces the feature dimension and the amount of computation, and it even improves performance when the raw features contain considerable noise. Detailed experiments are conducted on the KTH and UT-Interaction datasets. The results show that an appropriate combination of encoding and normalization methods can significantly improve the performance of the probabilistic latent semantic analysis model. The recognition accuracy reaches 96.44% and 95% on UT-Interaction set1 and set2 respectively, which outperforms the state of the art. In particular, 94.24% is obtained on UT-Interaction set1 using sparse STIPs.

2) Activity similarity recognition with over-complete sparse coding. Activity similarity recognition asks whether two activities are similar or not, which is valuable for understanding the activities in videos and offers a new approach to cross-database recognition. An activity similarity recognition method based on over-complete sparse coding is proposed. First, a Gaussian mixture model (GMM) is learned on the training set. Then a sub-codebook is learned for each mixture component, and the sub-codebooks of all components are merged into an over-complete codebook. When encoding, each feature is first classified by the Gaussian mixture model; to retain more feature information, the three components with the highest probability are kept and their probabilities are normalized. The feature is then sparsely coded with the sub-codebooks of the retained components. Finally, a support vector machine is used for classification. The method learns the submanifold structure of the feature space through the Gaussian mixture model. On each component, the feature is coded with a relatively small dictionary, which both reduces the computing power required and improves the ability to describe activities. Experiments on the ASLAN database verify the effectiveness of the proposed method.

3) Activity recognition based on the Fisher vector and the vector of locally aggregated descriptors (VLAD). Hard quantization in VLAD loses information, and the Fisher vector uses only first- and second-order statistical moments. Two improved methods are proposed to address these problems. The effect of principal component analysis preprocessing on coding performance is examined first. Building on this, the first method replaces vector quantization with two soft assignment schemes, and the resulting soft-assigned VLAD improves on the performance of standard VLAD coding. The second method exploits the fact that higher-order moments of the feature distribution carry more information about the features: these moments are integrated into Fisher vector coding to obtain a new vector that combines high-order moments. Experiments on the KTH, UT-Interaction, UCF Sports and UCF101 datasets verify the effectiveness of the proposed methods.

4) Super-vector activity recognition based on spatio-temporal information. The spatio-temporal relationships between features contain rich information that is important for improving activity recognition in videos. Building on the previous chapter, a super-vector activity recognition method based on spatio-temporal information is proposed. Unlike the previous chapter, spatio-temporal information is integrated into the super-vector coding. First, spatio-temporal interest points are extracted and clustered by their position coordinates, so that the features are partitioned into spatio-temporal subvolumes. Within each subvolume, various high-order statistical moments are used to encode the local set of feature points. Finally, the local statistical vectors are combined with the global Fisher vector coding to form the super-vector representation of the video. The proposed method combines the global and local distribution characteristics of the features and incorporates the spatio-temporal relationships between them. Experiments on the KTH, UCF Sports and UCF101 datasets achieve better recognition rates; notably, the accuracy on UCF101 is higher than that of some methods based on deep learning features.
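
As a rough illustration of the soft-assignment idea in 3) and the power normalization mentioned in 1), the sketch below encodes a set of local descriptors as a soft-assigned VLAD vector. It is not the dissertation's implementation. The Gaussian-kernel weighting, the choice of the k = 3 nearest codewords, the parameters beta and alpha = 0.5, the toy codebook, and the function names soft_weights and soft_vlad are all assumptions made for illustration.

```python
import math


def soft_weights(x, codebook, k=3, beta=1.0):
    """Return normalized weights for the k codewords nearest to x.

    Assumption: weights follow a Gaussian kernel exp(-beta * d^2);
    codewords outside the k nearest implicitly get weight 0.
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    nearest = sorted(range(len(codebook)), key=lambda i: d2[i])[:k]
    # Subtract the smallest distance before exponentiating for stability.
    d_min = d2[nearest[0]]
    raw = {i: math.exp(-beta * (d2[i] - d_min)) for i in nearest}
    total = sum(raw.values())
    return {i: w / total for i, w in raw.items()}


def soft_vlad(features, codebook, k=3, beta=1.0, alpha=0.5):
    """Soft-assigned VLAD followed by power and L2 normalization."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for x in features:
        # Each descriptor contributes a weighted residual to k codewords
        # instead of a full residual to a single hard-assigned codeword.
        for i, w in soft_weights(x, codebook, k, beta).items():
            for j in range(dim):
                acc[i][j] += w * (x[j] - codebook[i][j])
    flat = [v for row in acc for v in row]
    # Power normalization: sign(v) * |v|^alpha.
    flat = [math.copysign(abs(v) ** alpha, v) for v in flat]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy data (hypothetical): 4 codewords and 3 descriptors in 2-D.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = [[0.1, 0.2], [0.9, 0.1], [0.4, 0.8]]
vec = soft_vlad(features, codebook)
print(len(vec))  # one residual block of length dim per codeword
```

Setting k = 1 reduces the weighting to hard assignment, which gives a simple baseline for checking how much the soft variant changes the encoding on a given feature set.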
Keywords/Search Tags:activity recognition, topic model, Fisher vector, spatio-temporal information, Gaussian mixture model