Single-view 3D face reconstruction is a popular research direction in computer vision: it generates a realistic 3D face from a single 2D face image and plays a key role in animation rendering and 3D content creation. However, the technique faces many challenges, such as the difficulty of obtaining labeled 3D datasets, the tendency of neural network training to overfit or underfit, and poor reconstruction accuracy. In this paper, we explore a self-supervised approach to more realistic 3D face reconstruction. Specifically, using existing 3D face model data, we train a neural network in a self-supervised manner so that it learns to extract face shape and texture features and performs 3D reconstruction on that basis, achieving higher reconstruction accuracy and more realistic 3D faces.

Traditional 3D face reconstruction methods usually fail to capture facial details and individual features, and their results tend toward the average model. Model-free reconstruction has therefore become a hot topic of current research, but the accuracy of existing methods still needs improvement, mainly because their sub-network structures are inadequately designed and their ability to extract features from 2D face images is insufficient. To address these problems, this paper proposes a self-supervised 3D face reconstruction scheme that learns multi-task information such as albedo, depth, illumination, and viewpoint, recovering the hidden information in each image to reconstruct a more detailed 3D structure. The method uses neural networks to learn the intrinsic regularities of the data, thereby improving the accuracy and precision of the reconstructed faces.

The work in this paper includes the following main aspects:

1. To improve the learning capability of the network model, this paper proposes a dual-attention method with multi-scale feature fusion for improving 3D face reconstruction accuracy. A multi-scale feature extraction and fusion module is introduced to obtain richer multi-scale face feature information and further enhance the feature extraction capability of the encoder-decoder network, and a dual attention mechanism module is introduced so that the model extracts detail features more comprehensively and sufficiently. Unlike traditional methods, this method is self-supervised and takes a single image as input, avoiding demanding dataset requirements. To verify its effectiveness, qualitative, quantitative, and ablation experiments are conducted on the BFM, PhotoFace, and CelebA face datasets. The experimental results show that the algorithm improves on all relevant metrics compared to commonly used reconstruction algorithms.

2. To enrich texture detail information and improve the network's reconstruction of local details, this paper proposes a self-supervised deep learning network for single-view reconstruction aimed at improving the texture detail of 3D faces. The main idea is to first generate an unfolded texture and a globally parameterized prior albedo using prior modules based on the 3DMM model; this prior information can be used to train a model with realistic rendering effects. A detail refinement module then synthesizes a final texture with high-frequency detail and integrity, allowing the network to better model real-world face morphology and texture features. Finally, extensive experiments are conducted on the BFM and CelebA datasets. The experimental results show that the proposed method effectively enhances 3D texture detail reconstruction.
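The self-supervised decomposition described above re-renders the input from predicted albedo, depth (via surface normals), illumination, and viewpoint, and penalizes the photometric difference between the input and the re-rendered image. A minimal per-pixel sketch, assuming a simple ambient-plus-diffuse Lambertian lighting model (the function names and coefficients here are illustrative, not the thesis's actual implementation):

```python
import math

def lambert_shade(albedo, normal, light_dir, ambient=0.2, diffuse=0.8):
    """Shade one pixel: albedo scaled by an ambient term plus a diffuse
    n.l term, clamped to the valid intensity range (illustrative model)."""
    def unit(v):
        m = math.sqrt(sum(c * c for c in v))
        return [c / m for c in v]
    n, l = unit(normal), unit(light_dir)
    ndotl = max(0.0, sum(a * b for a, b in zip(n, l)))
    return [min(1.0, a * (ambient + diffuse * ndotl)) for a in albedo]

def photometric_l1(pred, target):
    """Self-supervision signal: mean L1 distance between the re-rendered
    pixel and the observed input pixel."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
```

A frontally lit, frontally oriented pixel with unit albedo re-renders at full intensity, so its photometric loss against an identical observation is zero; training drives the predicted albedo, normals, and lighting toward values that make this loss small everywhere.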
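Dual attention modules typically combine a channel branch and a spatial branch. As a hedged illustration of the channel branch only, here is a squeeze-and-excitation-style gate in pure Python: a real module would insert a learned bottleneck MLP between the pooled statistics and the sigmoid, which this sketch omits.

```python
import math

def channel_attention(fmap):
    """fmap: a list of C channels, each an HxW list of lists.
    Squeeze: global average pool per channel.
    Excite: sigmoid gate on the pooled value (learned MLP omitted).
    Scale: reweight every value in a channel by its gate."""
    gates = []
    for ch in fmap:
        avg = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gates.append(1.0 / (1.0 + math.exp(-avg)))  # sigmoid
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(fmap, gates)]
```

The gate lets informative channels pass through nearly unchanged while down-weighting uninformative ones; the spatial branch of a dual-attention block applies the analogous idea across locations instead of channels.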
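The detail refinement step in the second contribution can be viewed as composing the smooth prior albedo with a predicted high-frequency residual. A minimal sketch under that assumption (`refine_texture` is an illustrative name, not the thesis's module; values are clamped to a valid intensity range):

```python
def refine_texture(prior, residual):
    """Compose a refined texture map: the smooth prior albedo plus a
    predicted high-frequency detail residual, clamped to [0, 1]."""
    return [[max(0.0, min(1.0, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prior, residual)]
```

Keeping the prior and the residual as separate predictions lets the 3DMM-based prior modules guarantee a globally plausible base texture while the refinement network is free to add person-specific high-frequency detail on top.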