3D human reconstruction aims to recover the 3D geometry of the human body from 2D observations. The reconstructed body can be used in fields such as video production, game development, medical services, and virtual fitting, so the problem has considerable research value. Traditional methods use 3D scanners based on structured light or phase measurement to reconstruct fine human body meshes. However, such scanners are expensive, so this approach is not widely accessible. With the continued development of computer vision and deep learning, some advanced models can already reconstruct a complete 3D human body from a single RGB image. However, a single image lacks information about depth, body shape, and the details of the back. As a result, single-view 3D human reconstruction faces many difficulties. For example, human poses in input images are diverse, so some body parts are inevitably occluded. If the occluded regions are not inferred reasonably, reconstruction artifacts or incomplete limbs appear in the result. To address these challenges, this paper studies parametric and non-parametric methods for single-view 3D human reconstruction in depth. Building on multi-attribute priors, it standardizes the 3D human reconstruction process. The main research contents include the following three aspects:

(1) To regress the parameters of a parametric human model, a method for regressing the skinned multi-person linear (SMPL) human body based on graph convolution and vertex offsets is proposed. Instead of regressing the SMPL template parameters directly from the image features of the input image, this method first iteratively updates the 3D coordinates of each mesh vertex through a graph convolutional neural network. Then it extracts the SMPL
parameters from the regression results. To reduce the geometric bias of the reconstruction along the depth (z-axis) direction, a depth regression reconstruction loss is proposed. It uses the SMPL parameter space to constrain the deformation of human joints, which further improves the accuracy of depth pose prediction while preserving the integrity of the human body. The experimental results show that this method is competitive: it predicts local joints more accurately, and it alleviates the depth ambiguity problem in 3D space to a certain extent.

(2) To address the information missing from single-view images, a rear-view image synthesis method based on conditional generative adversarial networks (GANs) is proposed. Building on the core idea of the conditional GAN, the method uses the human contour as a semantic label for supervised translation and generates the rear-view image inferred from the input front-view image. The network consists of a coarse-to-fine image generator and a multi-scale discriminator. The multi-scale design of the generator and the discriminator accounts for both local and global image features. Each submodule only needs to process image data of a small size, so the method is computationally efficient. The experimental results show that the method generates rear-view human images as expected. Its multi-scale structure is also better suited to high-resolution image-to-image translation tasks.

(3) To address reconstruction artifacts and incomplete limbs in single-view 3D human reconstruction, an implicit 3D human reconstruction method based on parametric models and normal features of the front and rear views is proposed. The network is divided into three modules:
the prediction of the SMPL parametric model, normal inference from the front and rear views, and implicit surface reconstruction. First, the SMPL parametric model and the rear-view image are each predicted from the input image. Then the volume features extracted from the SMPL model and the normal features extracted from the front-view and rear-view images are used as additional inputs to the deep implicit function to assist training. In this way, the reconstruction results maintain a more standardized human structure under the prior constraints. The experimental results show that the method effectively handles the diversity of human poses and shapes. Its generalization ability and scalability are also better than those of several existing methods.
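
The query step of the third method, in which a deep implicit function receives SMPL volume features and front-view and rear-view normal features for each 3D point, can be sketched roughly as follows. This is only a minimal NumPy illustration under several assumptions that the abstract does not state: query points lie in a normalized cube [-1, 1]^3, projection is orthographic, the rear-view normal map is pixel-aligned with the front view, the SMPL model has already been voxelized into a feature volume, and the implicit function is a small MLP that outputs occupancy. All function names (`sample_bilinear`, `sample_volume_nearest`, `build_query_features`, `occupancy_mlp`, `init_mlp`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_bilinear(feat, xy):
    """Bilinearly sample an (H, W, C) map at continuous pixel coords xy of shape (N, 2), ordered (x, y)."""
    h, w, _ = feat.shape
    x = np.clip(xy[:, 0], 0, w - 1)
    y = np.clip(xy[:, 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bottom = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def sample_volume_nearest(vol, pts):
    """Nearest-voxel lookup in a (D, H, W, C) volume for points (N, 3) in [-1, 1]^3, ordered (x, y, z)."""
    d, h, w, _ = vol.shape
    upper = np.array([w, h, d]) - 1
    idx = np.rint((pts + 1) / 2 * upper).astype(int)
    idx = np.clip(idx, 0, upper)
    return vol[idx[:, 2], idx[:, 1], idx[:, 0]]


def build_query_features(pts, front_normals, rear_normals, smpl_volume):
    """Concatenate per-point features: front normal, rear normal, SMPL volume feature, depth z."""
    h, w, _ = front_normals.shape
    # Assumed orthographic projection: (x, y) in [-1, 1] maps linearly to pixel coordinates.
    xy = (pts[:, :2] + 1) / 2 * np.array([w - 1, h - 1])
    f_front = sample_bilinear(front_normals, xy)
    # Assumption: the rear-view normal map shares the front view's pixel grid.
    f_rear = sample_bilinear(rear_normals, xy)
    f_vol = sample_volume_nearest(smpl_volume, pts)
    return np.concatenate([f_front, f_rear, f_vol, pts[:, 2:3]], axis=1)


def init_mlp(sizes, rng):
    """Random weights for a toy MLP; a real model would learn these during training."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def occupancy_mlp(features, weights):
    """ReLU MLP with a sigmoid output, giving one occupancy value in (0, 1) per point."""
    x = features
    for i, (w_mat, b) in enumerate(weights):
        x = x @ w_mat + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return 1.0 / (1.0 + np.exp(-x[:, 0]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.uniform(-1, 1, size=(5, 3))        # query points
    front = rng.normal(size=(8, 8, 3))           # stand-in front-view normal map
    rear = rng.normal(size=(8, 8, 3))            # stand-in rear-view normal map
    volume = rng.normal(size=(4, 4, 4, 2))       # stand-in voxelized SMPL features
    feats = build_query_features(pts, front, rear, volume)
    occ = occupancy_mlp(feats, init_mlp([feats.shape[1], 16, 1], rng))
    print(feats.shape, occ.shape)
```

In a setup of this kind, the surface would typically be extracted as a level set of the predicted occupancy over a dense grid of query points; the abstract does not specify how the paper performs that step.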