
3D Human Regression Network Based On Supervision Of Reconstructed Data

Posted on: 2023-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Zhang
Full Text: PDF
GTID: 2568307103993379
Subject: Computer Science and Technology
Abstract/Summary:
Reconstructing 3D human bodies from RGB images or videos has a wide range of applications in fields such as human-computer interaction, robotics, video analytics, virtual and augmented reality, and film production. Although a large body of work has been devoted to this area, 3D human pose estimation from monocular images or videos still faces a variety of difficulties, such as the lack of depth information, occlusion of people in the image, cluttered backgrounds, and a shortage of training data, all of which lead to poor reconstruction results. The pose and shape accuracy of the reconstructed model is not high enough, and details such as facial expressions and hand gestures are not faithfully recovered. Among reconstruction methods based on parametric representations, optimization-based approaches are accurate but inefficient; methods based on image depth information are also accurate but produce too many outliers; and methods that regress the parameters with a deep neural network are efficient but less accurate. To address these problems, this thesis conducts two studies.

First, a cascaded network is proposed for reconstructing a 3D hand model from a monocular RGB image. A CNN extracts sparse features from the image; a multilayer perceptron regresses the hand model parameters from the sparse features; the regressed parameters initialize an iterative optimization routine that fits the hand model to the 3D joint points; finally, the iteratively fitted parameters are used in turn to supervise the entire network.

Second, METRO-X, an algorithm for full 3D human shape and pose reconstruction, is proposed. It takes a hybrid approach that combines parametric models with non-parametric methods, exploiting both the accuracy of the non-parametric method (METRO) and the compactness of the SMPL-X model (a low-dimensional subspace) to regress the pose and shape parameters of the full human body from a single RGB image. A CNN extracts image features, which are fed to a transformer that regresses the vertex coordinates; a multilayer perceptron then regresses the SMPL-X model parameters from the predicted vertices.

This thesis also implements a modular system that detects the bounding boxes of the face, hands, and body in an image, uses METRO-X to reconstruct the 3D shape and pose of each of the three parts at multiple scales, and finally fuses the three parts to output a full 3D human model with facial expressions, hand gestures, and body pose. Experimental results show that the system accurately recovers the body shape and pose, hand gestures, and facial expressions of the people in the image. Compared with the ExPose method, body accuracy is improved by about 23% and gesture accuracy by about 35%.
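The cascaded scheme in the first study (regress, refine by iterative fitting, then supervise the regressor with the fitted parameters) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the hand model is replaced by a toy linear map `BASIS`, the regressor's output by a fixed initial guess, and the optimizer by plain gradient descent. The function names are hypothetical.

```python
# Toy stand-in for a parametric hand model: joints = BASIS @ params.
# A real hand model is nonlinear; this linear map is an assumption
# made only to keep the sketch self-contained.
BASIS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]


def model_joints(params):
    """Map model parameters to (toy) joint coordinates."""
    return [sum(b * p for b, p in zip(row, params)) for row in BASIS]


def joint_loss(params, target_joints):
    """Sum of squared errors between model joints and target joints."""
    return sum((j - t) ** 2 for j, t in zip(model_joints(params), target_joints))


def fit_to_joints(init_params, target_joints, steps=200, lr=0.1):
    """Refine parameters by gradient descent, starting from the regressed estimate."""
    params = list(init_params)
    for _ in range(steps):
        residual = [j - t for j, t in zip(model_joints(params), target_joints)]
        # Gradient of the squared error w.r.t. each parameter: 2 * BASIS^T @ residual.
        grad = [
            2.0 * sum(BASIS[i][k] * residual[i] for i in range(len(BASIS)))
            for k in range(len(params))
        ]
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def supervision_loss(regressed_params, fitted_params):
    """Parameter-space loss that would be backpropagated into the regressor."""
    return sum((r - f) ** 2 for r, f in zip(regressed_params, fitted_params))


true_params = [0.5, -0.3]
target = model_joints(true_params)  # stands in for the 3D joint points
regressed = [0.0, 0.0]              # stands in for the network's initial output
fitted = fit_to_joints(regressed, target)
loss = supervision_loss(regressed, fitted)
```

In this sketch the fitted parameters act as pseudo ground truth: the regressor would be trained to reduce `loss`, so that later predictions start closer to the optimizer's solution and need fewer refinement steps.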
Keywords/Search Tags: 3D human body reconstruction, 3D hand reconstruction, 3D facial reconstruction, Transformer