Research On 3D Human Pose Estimation In The Wild

Posted on: 2022-03-13    Degree: Master    Type: Thesis
Country: China    Candidate: X H Ji    Full Text: PDF
GTID: 2518306491455014    Subject: Computer system architecture
Abstract/Summary:
3D human pose estimation is a key problem in computer vision and is widely used in human-computer interaction, animation production, visual surveillance, and other applications. This dissertation focuses on 3D human pose estimation in the wild, which poses two main challenges. First, recovering a 3D human pose from a single monocular image is ill-posed: monocular images lose much of the depth information, which makes the mapping from 2D to 3D highly nonlinear. Second, in-the-wild human pose datasets are very limited. Most existing 3D human pose datasets are captured in laboratory environments with fixed scenes, so models trained on them generalize poorly to scenes in the wild.

This dissertation therefore studies how to improve the accuracy and generalization ability of 3D human pose estimation models in the wild. Taking a model based on weakly-supervised transfer learning as the baseline, the research is carried out in the following two aspects:

(1) An attention mechanism and a pose calibration module are introduced to improve the accuracy of human pose estimation. First, a channel weight learning algorithm based on the attention mechanism is used to strengthen the baseline model's ability to analyze depth information, so that the depth coordinate of each joint is regressed more accurately. Second, a multi-scale 3D human pose calibration network is introduced, which constructs three human skeleton pose models at different scales to adaptively learn the structure and motion characteristics of the human body and thereby calibrate the estimates of the baseline model. Compared with the baseline model, the proposed model reduces the MPJPE (mean per-joint position error) on the Human3.6M test set by 3.28 mm.

(2) A method based on a Bayesian Network is proposed for acquiring an in-the-wild dataset with high-quality labels, in order to improve the generalization ability of the baseline. The outputs of several existing 3D human pose estimation models are taken as weak labels and combined with the ground truth, and a Bayesian Network is used to learn the dependencies between them. Then, for a given set of in-the-wild images, a dataset with high-quality labels can be acquired and used to fine-tune the baseline model. Experimental results on the LSP and MPII datasets show that the fine-tuned model generalizes better to images in the wild.
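For readers unfamiliar with the MPJPE metric reported above, the following is a minimal sketch of how it is commonly computed: the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints. The function name `mpjpe` and the three-joint poses are hypothetical illustrations, not taken from the thesis, and any alignment step an evaluation protocol may apply beforehand (such as centering on a root joint) is omitted.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error: average Euclidean distance over joints.

    Both arguments are sequences of (x, y, z) joint coordinates in the same
    unit (millimetres here) and in the same joint order.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("pose lists must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Hypothetical 3-joint poses in millimetres (illustrative values only).
ground_truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (103.0, 4.0, 0.0), (100.0, 200.0, 12.0)]

# Per-joint errors are 0 mm, 5 mm and 12 mm, so the mean is 17/3 mm.
error_mm = mpjpe(predicted, ground_truth)
print(f"MPJPE: {error_mm:.2f} mm")
```

Under this definition, a reduction of 3.28 mm means the predicted joints lie, on average, 3.28 mm closer to their ground-truth positions than those of the baseline.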
Keywords/Search Tags: 3D human pose estimation, Attention mechanism, Multi-scale 3D human pose calibration network, Bayesian Network