
Research On Methods Of Feature Representation For Human Action Recognition In RGBD Video

Posted on: 2018-06-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Huang
Full Text: PDF
GTID: 1368330545497329
Subject: Computer Science and Technology
Abstract/Summary:
Human action recognition, a hot research topic in computer vision, can be applied to many scenarios, such as intelligent video surveillance, human activity analysis, and video retrieval. Recently, researchers have found that the depth information provided by RGBD sensors, such as Kinect, improves action recognition results. Based on a survey of human action recognition in RGBD videos, we found that existing works share some common drawbacks: 1) A large image area is always required to extract feature descriptors, leading to high computational complexity and high dimensionality; 2) These features generally lack semantic meaning, so it is very difficult to judge whether or not a descriptor makes sense; 3) Multiple features are fused to improve recognition accuracy, but the fusion is difficult. To address these drawbacks, we focused our research on representing actions with discriminative features. The major works and contributions are summarized as follows.

1. To address high dimensionality and the lack of semantic meaning, an approach based on Random Forest (RF) Out-of-Bag (OB) estimation is proposed to select discriminative parts for each action. First, the features of each joint-based part are fed separately into the RF classifier. The OB estimation of each part is used to evaluate the discrimination of the joints in that part. Second, joints with high discrimination over the whole dataset are selected to design the feature. The Discriminative Parts (DPs) provide a rich semantic clue for each action, so the actions are distinct from each other. Experiments conducted on the MSR Action dataset show that the proposed method extracts DPs that are highly discriminative and reflect the semantic meaning of each action. The results show that our method outperforms state-of-the-art methods in accuracy with lower feature dimensions.

2. Most feature dimensions are still too high for real-time applications. Part-level motion clustering can help reduce dimensions and verify actions. To this end, a meta-action descriptor for action recognition in RGBD video is proposed in this study. Specifically, two discrimination-based strategies, dynamic part clustering and discriminative part clustering, are introduced to improve accuracy. Experiments conducted on the MSR Action 3D dataset show that the proposed method significantly outperforms methods that do not use joint position semantics.

3. Multiple features, such as depth appearance and joint motion, describe actions from different aspects. Some features are discriminative for particular actions but not for others. In previous work, those features are always fused to improve recognition accuracy, and all actions are described by the same feature. This kind of feature fusion may result in high feature dimensions and does not make use of feature discrimination. Therefore, a novel approach is proposed to select among multiple features according to the RF's classification entropy. Based on the RF's classification entropy, the discrimination of the depth appearance feature is evaluated for each testing sample. Then either the depth appearance feature or the joint motion feature is selected to classify the testing sample. In particular, a new single feature, the Discriminative Part Depth Motion Maps based Convolutional Neural Network (DPDMM-CNN), is proposed to describe actions. Our feature selection approach involves a small number of parameters and produces features with low dimensions. Experiments conducted on the MSR Daily Activity 3D dataset show that the feature selection approach outperforms both single-feature approaches and classical feature-fusion approaches.

To sum up, this thesis aims at extracting features with high discrimination, low feature dimensions, and rich semantic meaning. Three novel approaches to feature representation for 3D human action recognition are proposed. The OB estimation of RF is applied to learn the DPs, and the learned DPs are applied in feature clustering, CNN modeling, and feature selection. Experimental results show that our proposed methods are superior to other methods in recognition accuracy, computational cost, and semantic meaning. The related results are beneficial to the development of recognition, feature coding, classification, etc.
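The entropy-based feature selection in contribution 3 can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the threshold value, and the rule "low entropy on the depth appearance classifier means that feature is trusted" are assumptions made here for illustration, and the probability vectors stand in for the per-class vote fractions that a Random Forest would produce for one testing sample.

```python
import math


def vote_entropy(probs):
    """Shannon entropy (in bits) of a per-class probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_and_classify(depth_probs, joint_probs, threshold):
    """Choose one feature per testing sample, then classify with it.

    depth_probs / joint_probs are hypothetical per-class vote fractions from
    two classifiers, one trained on a depth appearance feature and one on a
    joint motion feature. The threshold is an assumed tuning parameter.
    """
    # Assumption: low entropy means the depth appearance feature is
    # discriminative for this sample, so its prediction is used.
    if vote_entropy(depth_probs) <= threshold:
        chosen, probs = "depth_appearance", depth_probs
    else:
        chosen, probs = "joint_motion", joint_probs
    # Predicted class is the index with the largest vote fraction.
    label = max(range(len(probs)), key=probs.__getitem__)
    return chosen, label


# A sample where the depth appearance votes are concentrated on one class.
confident = select_and_classify([0.9, 0.05, 0.05], [0.2, 0.5, 0.3], threshold=1.0)

# A sample where the depth appearance votes are spread across classes.
uncertain = select_and_classify([0.4, 0.3, 0.3], [0.1, 0.8, 0.1], threshold=1.0)

print(confident, uncertain)
```

Because only one feature is used per sample, no fused feature vector is built, which is consistent with the low-dimension claim in the abstract; how the threshold is actually chosen in the thesis is not stated here.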
Keywords/Search Tags:Human Action Recognition, RGBD, Discriminative Part, Clustering, Feature Selection, Convolutional Neural Network