
3D Human Pose Estimation In The Wild

Posted on: 2020-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: X H Song
Full Text: PDF
GTID: 2428330590495231
Subject: Computer Science and Technology
Abstract/Summary:
Human beings are important objects of study in computer vision tasks. Human pose conveys rich information, and interpreting body posture is a basic ability of human vision. 3D human pose estimation is the process of detecting the positions of key body parts in image or video data and then recovering the three-dimensional posture of the body. It has broad application prospects in fields such as video surveillance and human-computer interaction, and it can provide useful information for research on behavior recognition, abnormal behavior detection, and related problems. When based on a single image, 3D human pose estimation suffers from ambiguity and from poor accuracy in natural scenes. This thesis improves a baseline pose estimation model to address these problems and carries out experiments to verify the effectiveness of the methods.

The thesis uses a deep learning-based approach to estimate 3D human pose from a single RGB image. Recovering 3D pose from a single viewpoint is inherently ambiguous. To resolve this ambiguity, a human body constraint module is introduced that encodes the dependencies between the joints of the human body. Through this module, the result estimated by the original model can be adjusted to satisfy the body constraints. The training data for 3D pose estimation models are mostly collected in indoor scenes, so a trained model is better suited to indoor scenes. To cope with the complex backgrounds of natural scenes, the baseline model is trained on a combination of a dataset with 2D annotations and a dataset with 3D annotations. To make full use of the 2D annotations, this thesis introduces a weakly supervised learning strategy. In natural scenes the size of the subject in the image varies more widely, so a feature fusion strategy is introduced that fuses different levels of the deep convolutional neural network, combining the strong semantics of the high-level features with the high resolution of the lower-level features. This allows the model to handle subjects of different sizes.

The experiments mainly use the Human 3.6M dataset, which has 3D pose annotations, and the public MPII dataset, which has 2D pose annotations. The improved model is compared with the baseline model, and the experimental results confirm the effectiveness of the improved model. On the Human 3.6M test set, the mean per joint position error after alignment is 1.9% lower than that of the baseline model, and the improved model generalizes better in the wild.
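The abstract reports its result as mean per joint position error (MPJPE) after alignment but does not define the metric or say which alignment is used. The following is a minimal sketch of one common reading, assuming that "alignment" means translating both poses so that a chosen root joint sits at the origin (a rigid Procrustes alignment is another common choice and is not shown here). The function names `root_align` and `mpjpe` and the `root_index` parameter are illustrative and do not come from the thesis.

```python
import math


def root_align(pose, root_index=0):
    """Translate a pose so that its root joint sits at the origin.

    `pose` is a sequence of (x, y, z) joint coordinates.
    Assumption: the root joint is stored at `root_index`.
    """
    rx, ry, rz = pose[root_index]
    return [(x - rx, y - ry, z - rz) for x, y, z in pose]


def mpjpe(predicted, target, root_index=0):
    """Mean Euclidean distance between corresponding joints after root alignment."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("poses must be non-empty and have the same number of joints")
    pred = root_align(predicted, root_index)
    gt = root_align(target, root_index)
    # math.dist computes the Euclidean distance between two points (Python 3.8+).
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / len(pred)


if __name__ == "__main__":
    # Toy three-joint pose; the coordinates are made up for illustration.
    target = [(0.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 0.0)]
    # Same pose shifted by (50, 50, 50), with the last joint 30 units off in x.
    predicted = [(50.0, 50.0, 50.0), (50.0, 150.0, 50.0), (80.0, 250.0, 50.0)]
    print(mpjpe(predicted, target))
```

In this toy case the global shift is removed by the alignment, so only the 30-unit error on the last joint contributes to the average over the three joints.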
Keywords/Search Tags: human pose estimation, human pose constraint, feature fusion, weakly supervised learning