Font Size: a A A

Human Pose Estimation Based On Deep Learning And Pictorial Structure Model

Posted on:2019-12-31Degree:MasterType:Thesis
Country:ChinaCandidate:H B DaiFull Text:PDF
GTID:2428330572452108Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
Human pose estimation is a hot issue in computer vision,and it is widely used in the field of behavior recognition,human-computer interaction and motion analysis.Human pose estimation is the process of detecting the 2D or 3D human body joint candidate positions and according to the space connections among human body joints to infer the optimal body pose from still images or videos.In a natural image,the lack of accurate and robust estimation of human pose could be due to the complex scenes and severe occlusion,varied illumination,different clothing and other factors.Therefore,in the unconstrained environment,it is a great challenge to achieve robust pose estimation with high precision for human bodies with high-degree of freedom.This paper aims at human pose estimation for natural images or videos,and it uses the deep learning network and the pictorial structure model to better describe the appearance model of human body parts and the spatial constraint model among human body parts,fully expressing the complex human body structure and solving problems in the process of static or dynamic human pose estimation in natural images.The main research contents are as follows:(1)A human pose tracking method based on a cascade error-correction mechanism is proposed,for the sake of solving the problem that it is difficult to get the accurate wrist positions in the process of human pose tracking for video sequence images.Compared with other human body joints,the wrist movement has large amplitude and no fixed trajectory,which results in the lowest locationing accuracy of the wrist among the whole human body estimated joints.However,can be accurately estimated the human pose at a certain time,mainly depends on the positions of the wrist.This method focuses on the inaccurate wrist locations and utilizes temporal information unique to video sequences images and an adaptive skin color model to design a cascade error-correction mechanism for gradually selecting an accurate wrist position to achieve smooth and stable he human pose tracking results.Firstly,inferring all body joint positions and calculating the posterior edge distribution probability of wrists in each frame by using the pictorial structure model.Secondly,a method based visual tracking is fused the posterior edge distribution probability of wrists from the first stage,which aims to obtain the wrist location.Finally,the cascade error-correction mechanism is used for correcting the predicted wrist positions.The experimental results on VideoPose2.0 and VIPS-VideoPose datasets verify the effectiveness of the proposed method for tracking dynamic human poses,especially on improving the wrist locationtioning accuracy.(2)A human pose estimation method based on Faster R-CNN is proposed,for the sake of solving the problem that is difficult to get the robust appearance model of human body parts in the process of still human pose estimation.The traditional methods almost have poor robustness because artificial extracted features are vulnerable to background environments,illumination variations and difference in clothing.This method focuses on the appearance model with poor robustness for the traditional methods and utilizes excellent extraction characteristics of deep learning networks to put forward the target detection network Faster R-CNN as a human part detector,successfully improving the accuracy of human body parts for still human pose estimation.Firstly,using the Faster R-CNN to model appearances of human parts and generating candidate rectangles of each human part.It mainly utilizes the features extracted by deep learning networks to replace the artificial extracted features in the traditional methods so that it can more fully describe the visual changes of human body part appearances in an unconstrained environment.In addition,this method also design a new spatial constraint model among human body parts for effectively selecting the optimal position for each part in the whole image from the candidate rectangles generated by Faster R-CNN.The experimental results on FLIC and Buffy Pose datasets fully illustrate that compared with other methods,the proposed method has a very good performance on still human pose estimation.(3)A human pose estimation method based on the integration of a deep learning network and a multi-level pictorial structure model is proposed,for the sake of improving the accuracy of still human pose estimation from both part detection and human spatial structure.Human pose estimation in still images can be elaborated as an example of target detection or described by the traditional pictorial structure model.However,the two methods have their own shortcomings,which are either the traditional pictorial structure model has poor robustness in the environment with a high degree of freedom or it ignores the spatial relations among human joints if processing the whole human pose estimation just from a point of view of target detection.This method proposes a framework of fusing a convolutional neural network into a multi-level pictorial structure model to improve the still pose estimation results from the two aspects of part detection and human spatial structure.In the stage of part detection,the probability vectors corresponding to human body parts in the whole image are regressed by a convolutional neural network.In addition,a new multi-level pictorial structure model is designed in this method,which contains the whole body,human body parts and joints and realizes the coarse-to-fine establishment of a more constrained human spatial constraint model from levels of the structures,the edges to the pixels.The obtained probability vectors is put into the multi-level pictorial structure model to inferring the location coordinates for each joint,successfully achieving a framework of combining the deep learning network and the multi-level pictorial structure model.This method is tested on the public LSP dataset,and the experimental results show that compared with other methods,the integration of the deep learning network and the multi-level pictorial structure model can to a greater extent improve the accuracy of still human pose estimation.
Keywords/Search Tags:Human Pose Estimation, Pictorial Structure Model, Deep Learning, Cascade Error-correction Mechanism, Faster R-CNN, Multi-level Pictorial Structure Model
PDF Full Text Request
Related items