Research On Human Pose Estimation And Pose Distance Metric Learning

Posted on: 2020-01-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W H Zhang
GTID: 1368330623956363
Subject: Computer Science and Technology

Abstract/Summary:
Human pose estimation is the task of computing the positions of the human skeleton or joint points from acceleration, image, or video data acquired by contact or non-contact sensors. With the development of computer graphics, computer vision, pattern recognition, and human-computer interaction, human pose estimation and pose distance metrics have been widely used in motion recognition, motion simulation, video surveillance analysis, behavior retrieval, and other fields, and have become an important research topic in computer vision, pattern recognition, and human-computer interaction.

According to the input data, non-contact human pose estimation methods can be divided into color image-based methods and depth image-based methods. The former rely on visible-light images, which are easily affected by background, illumination, occlusion, and other factors. The latter rely on scene depth obtained by structured light or time-of-flight sensing. Although depth images are, to a certain extent, insensitive to ambient illumination and the other factors that disturb visible-light images, they suffer from data defects of their own, such as noise, holes, and indistinct features. Human pose estimation from depth images therefore remains a challenging research topic in pattern recognition.

In addition, traditional methods for learning a human pose distance metric from skeleton point information are usually based on the Euclidean distance model or a sparse model, and require a large number of manually labeled similar/dissimilar pose pairs as training data. However, because human movement and behavior are complex, and because some human motion databases contain a large amount of noisy or low-confidence data, the traditional Euclidean distance model or sparse model struggles to describe the pose similarity of a moving human body accurately.

To address these problems, this thesis targets high-precision human pose estimation from depth images and RGB-D images, and studies pixel feature representation for depth images, feature fusion for RGB-D images, and unsupervised human pose distance metric learning. A human motion perception system is also developed. The main contributions of this thesis are as follows:

1. To address the inaccurate pixel feature characterization in traditional human pose estimation from depth images, a depth image human pose estimation method based on a hybrid feature of depth difference and geodesic distance is proposed. Traditional depth image-based pose estimation usually combines the depth difference feature of pixel pairs with a random decision forest classifier. Although the depth difference feature is simple to compute, it describes only the local relative depth between a pair of pixels. It cannot accurately describe the connectivity between non-rigid human body parts, which exhibit complex deformation, self-occlusion, and high noise. The proposed method combines the hybrid feature with superpixels on the depth image to describe the context of each pixel, which reduces the interference from non-uniform depth data and noise. The method not only improves the efficiency of feature extraction and random decision forest (RDF) training, but also improves the classification accuracy of human body parts. The thesis also proposes a pose estimation framework that combines RDF-based part classification with clustering-based sparse regression. The superpixel hybrid feature, combined with the part cluster-center features, is evaluated on data sets of differing quality and resolution, and the results demonstrate the robustness, efficiency, and accuracy of the method.

2. Based on Dempster-Shafer (D-S) theory, an appearance-shape fusion model is proposed for human pose estimation from RGB-D data. Traditional human pose estimation methods use descriptors such as HOG, SIFT, or contour descriptors as the features of skeleton points or body parts, and train the corresponding scoring models on intensity images. Such methods find it difficult to overcome the low accuracy caused by illumination changes and noise in intensity images. Image features provided by different information sources are generally one-sided, inaccurate, and incomplete, and may even be completely contradictory. How to integrate these features effectively and apply them to the computation of human skeleton points is the central difficulty of human pose estimation. The proposed appearance-shape fusion model fuses the HOG features of the intensity information source and the contour features of the depth information source based on D-S theory, and overcomes these problems by making full use of the complementarity of the two sources. Experimental results show that the model achieves efficient human pose estimation on RGB-D data.

3. Traditional human pose distance metric learning requires a large number of labeled samples and is inaccurate. An unsupervised model, the Bilayer Sparse Model with sparsity-induced adaptive neighborhood, is proposed for learning a pose distance metric. Human pose similarity is multi-level: a coarse pose is composed of the head, arms, feet, extremities, and trunk, while a full pose is composed of the wrists, elbows, shoulders, knees, and so on. The model therefore divides the offsets between human skeleton points into coarse pose data and full pose data and applies sparse representation to each type, which captures the underlying structure of human pose by exploiting the relationships among pose data at different scales. Experimental comparisons with state-of-the-art semi-supervised models show that the proposed unsupervised model achieves better results on pose retrieval.

4. Based on the research above, a prototype system for non-contact human-computer interaction is constructed. Through a modular design, the Human Motion Perception System integrates data acquisition, manual labeling, pose estimation, and result evaluation algorithms for the various parts of the human body into a single platform, which facilitates collaborative development of the modules of the human motion perception project and scales well. The practicality of the system is verified by using it as an application platform for high-precision human motion perception instruments.

Experiments show that the proposed human pose estimation algorithms are accurate and efficient, and that the proposed pose distance metric learning model is feasible and effective.
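The abstract's second contribution fuses evidence from an intensity source and a depth source using Dempster-Shafer theory. As a rough illustration of the underlying combination step only, the sketch below implements the standard Dempster rule of combination for two mass functions. It is not the thesis's fusion model: the function name `combine_masses`, the two-hypothesis frame (`elbow`, `wrist`), and the mass values are all hypothetical, and how the thesis derives masses from HOG and contour features is not stated in the abstract.

```python
from itertools import product


def combine_masses(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function is a dict mapping a frozenset of hypotheses to its
    mass; the masses in each dict are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the hypotheses both sets share.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise by 1 - K so the combined masses sum to 1 again.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}


# Hypothetical masses for one candidate location (illustrative values only).
ELBOW, WRIST = frozenset({"elbow"}), frozenset({"wrist"})
EITHER = ELBOW | WRIST  # mass on the whole frame expresses ignorance

appearance = {ELBOW: 0.6, EITHER: 0.4}            # e.g. from an intensity cue
shape = {ELBOW: 0.5, WRIST: 0.3, EITHER: 0.2}     # e.g. from a depth cue

fused = combine_masses(appearance, shape)
```

In this toy case the two sources partly disagree, so some mass is discarded as conflict and the remainder is renormalised; the fused belief in `elbow` ends up higher than either source assigned on its own.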
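The abstract's first contribution starts from the traditional depth difference feature of pixel pairs, which it describes as simple to compute but purely local. The sketch below shows one common depth-normalised formulation of that feature from the literature; the abstract does not give the exact formula the thesis uses, so the normalisation by the centre pixel's depth, the background constant, and the names `probe_depth` and `depth_difference` are assumptions made for illustration.

```python
import numpy as np

BACKGROUND_DEPTH = 1e6  # assumed large value for probes that fall off the image


def probe_depth(depth, row, col):
    """Return the depth at (row, col), or a large constant outside the image."""
    height, width = depth.shape
    if 0 <= row < height and 0 <= col < width:
        return float(depth[row, col])
    return BACKGROUND_DEPTH


def depth_difference(depth, pixel, u, v):
    """Difference between the depths at two probes offset from `pixel`.

    The offsets `u` and `v` are divided by the depth at `pixel`, so the probe
    pattern covers roughly the same physical extent at any camera distance.
    """
    row, col = pixel
    d = float(depth[row, col])
    r1, c1 = row + int(round(u[0] / d)), col + int(round(u[1] / d))
    r2, c2 = row + int(round(v[0] / d)), col + int(round(v[1] / d))
    return probe_depth(depth, r1, c1) - probe_depth(depth, r2, c2)


# Toy 5x5 depth map: a flat surface at depth 2.0 with one farther pixel.
depth_map = np.full((5, 5), 2.0)
depth_map[2, 4] = 3.0

feature = depth_difference(depth_map, (2, 2), u=(0, 4), v=(0, -4))
```

Because the feature compares only two probed depths, it says nothing about whether the probed pixels belong to connected body parts, which is the limitation the abstract gives as motivation for adding geodesic distance.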
Keywords/Search Tags: Human pose estimation, Sparse representation, Dempster-Shafer fusion theory, Pose distance metric, Unsupervised learning