Research On Human Action Recognition Based On Depth Map Sequences

Posted on: 2019-12-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X P Ji
GTID: 1368330596456232
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
In recent years, human action recognition has become a research hotspot in computer vision and has attracted considerable attention from researchers. The emergence of consumer depth cameras eases the target detection and segmentation tasks that are difficult with conventional visual images, shows excellent performance in human posture estimation and action recognition, and provides new ideas for human motion recognition. Focusing on human action recognition from depth sequences, this dissertation carries out research in four areas: low-level feature extraction, middle-level feature encoding, spatio-temporal feature representation, and feature learning with deep learning. The main contributions of this dissertation are as follows:
(1) A human action recognition method based on a skeleton embedded in the depth map. The human body is divided into different motion parts according to the positions of the embedded skeleton joints, and a local spatio-temporal model is then constructed to obtain a compact representation of the low-level features extracted from the motion parts. A simplified Fisher Vector method encodes the low-level features, which vary in duration, into feature vectors of uniform length. The experimental results demonstrate that the proposed method is fast enough to meet the demands of real-time human action recognition in some scenarios.
(2) A spatio-temporal cuboid pyramid representation for human action recognition. The depth motion sequences, projected onto three views, are divided into subsets by the spatio-temporal cuboid pyramid. Using the proposed cuboid encoding scheme, a final feature vector with strong space-time descriptive ability is generated from the subsets. The experimental results demonstrate that the proposed method achieves better recognition performance in relatively simple scenes.
(3) A spatial Laplacian and temporal energy pyramid representation for human action recognition. The proposed spatial Laplacian and temporal energy pyramid decomposes the depth map sequences into high-frequency and low-frequency components distributed over different spatio-temporal positions. Two different low-level features are extracted, one from the high-frequency components and one from the low-frequency components, and the final action class is obtained by fusing the two kinds of features. The experimental results demonstrate that the proposed method effectively describes spatial appearance and temporal motion information. Furthermore, it has clear advantages in recognition accuracy and computational efficiency compared with other state-of-the-art methods.
(4) A ResNet-based network using two-stream information fusion for human action recognition. Using the original depth data as the appearance stream and the oriented gradient vectors extracted from the depth data as the motion stream, a pseudo-3D ResNet is constructed to fuse the two streams at an early stage, so that it can learn high-level feature representations across the two streams. The experimental results demonstrate that the proposed method performs excellently while using half the number of parameters of conventional 3D convolution. The method achieves state-of-the-art performance on the large-scale NTU RGB+D dataset.
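The spatial decomposition in contribution (3) can be illustrated with a generic Laplacian pyramid. The following is a minimal sketch, not the dissertation's implementation: it assumes a single depth frame whose sides are divisible by 2 for every level, uses a 2x2 box average in place of the usual Gaussian filter, and all function names and the synthetic frame are hypothetical. It covers only the spatial high/low frequency split, not the temporal energy pyramid.

```python
import numpy as np

def downsample(frame):
    """Halve resolution by averaging each 2x2 block (box filter, assumed stand-in for Gaussian blur)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(frame):
    """Double resolution by repeating each pixel (nearest-neighbour)."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(frame, levels):
    """Split one depth frame into high-frequency bands and a low-frequency residual."""
    bands = []
    current = frame.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        # High-frequency detail lost when going to the coarser level.
        bands.append(current - upsample(smaller))
        current = smaller
    return bands, current

def reconstruct(bands, residual):
    """Invert the decomposition by adding the bands back from coarse to fine."""
    current = residual
    for band in reversed(bands):
        current = upsample(current) + band
    return current

# Synthetic 64x64 "depth frame" (values are arbitrary, for illustration only).
rng = np.random.default_rng(0)
depth_frame = rng.integers(500, 4000, size=(64, 64)).astype(np.float64)
bands, residual = laplacian_pyramid(depth_frame, levels=3)
```

Here `bands` holds the high-frequency components at three scales and `residual` is the low-frequency component; separate features could then be computed from each, in the spirit of the two-feature fusion described above.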
Keywords/Search Tags: Human Action Recognition, Depth Map Sequence, Feature Extraction, Feature Representation