| In recent years, artificial intelligence technology represented by deep learning has brought great changes to various industries, especially in the field of computer vision. Facial landmarks describe the positions of the key organs of the face and represent its geometric structure, so facial landmark detection is an important foundation for related applications. Although deep learning has effectively improved the accuracy of facial landmark detection, detection performance degrades seriously when the face is affected by complex conditions such as occlusion, illumination, and expression. This thesis studies facial landmark detection with deep learning in order to improve detection performance, and further studies 3D facial reconstruction based on facial landmark detection. First, a new stacked attention hourglass network (SAHN) is proposed, in which an attention mechanism is introduced into the stacked hourglass network (SHN). SAHN uses the resulting heatmap for regression. In the design of SAHN, a spatial attention residual (SAR) unit is presented so that the regions relevant to facial landmarks are emphasized and essential information at different scales can be well extracted. In addition, a channel attention branch (CAB) is introduced to better guide the next-level hourglass network in feature extraction. The traditional SHN is usually composed of four hourglass networks, whereas the SAHN proposed in this thesis achieves satisfactory performance by stacking only two hourglass networks, owing to the SAR unit and the CAB block, so the number of parameters and the amount of computation are greatly reduced. Then, to further improve the accuracy of facial landmark detection under complex conditions such as occlusion, illumination, and expression, the soft-argmax and differentiable spatial to numerical transform (DSNT) methods are used to obtain the landmark coordinates from the heatmap. A variable
robustness (VR) loss function is presented for training the proposed SAHN in an end-to-end regression manner. During training, the shape of the VR loss function is changed by adjusting its parameters so that the influence of outliers is efficiently reduced in the later stage of training. Therefore, facial landmarks subject to uncertain circumstances can be well predicted with the help of the VR loss, and the robustness of facial landmark detection is guaranteed. Finally, facial landmark detection is applied to 3D facial reconstruction. In this thesis, VRN-SAHN is proposed by further improving the volumetric regression network (VRN) based on SAHN. In VRN-SAHN, the voxel representation of the 3D face is obtained from the 2D face image and the facial landmark heatmap. The final 3D point cloud is obtained through the marching cubes (MC) algorithm and the iterative closest point (ICP) algorithm. SAHN is trained and tested on three public datasets: 300W, WFLW, and COFW. The experimental results illustrate the advantages of the SAR unit, the CAB block, and the VR loss function. Compared with some existing algorithms, the method proposed in this thesis achieves better results on the three datasets. The normalized mean error (NME) drops to 4.23% on WFLW, the most complex of the three datasets. The experimental results also show that facial landmark detection plays an important role in 3D facial reconstruction, and that VRN-SAHN can effectively improve the performance of 3D facial reconstruction. |
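
The heatmap-to-coordinate step mentioned in the abstract can be illustrated with a minimal soft-argmax sketch. This is not the thesis's implementation: the function name `soft_argmax_2d`, the `beta` sharpness parameter, and the plain nested-list heatmap are assumptions made purely for illustration. The idea is to turn the heatmap into a probability distribution with a softmax and return the expected pixel position, which stays differentiable where a hard argmax does not.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) position under a softmax of the heatmap.

    heatmap: list of rows (H x W) of raw scores (hypothetical input format).
    beta:    assumed sharpness factor; larger values approach a hard argmax.
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)

    # Expected column index (x) and row index (y) under the normalized weights.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# A 3x3 heatmap that is symmetric about its center pixel (column 1, row 1).
heatmap = [
    [0.0, 1.0, 0.0],
    [1.0, 4.0, 1.0],
    [0.0, 1.0, 0.0],
]
x, y = soft_argmax_2d(heatmap)

# A sharply peaked heatmap with a large beta behaves like a hard argmax.
peaked = [
    [0.0, 0.0, 10.0],
    [0.0, 0.0, 0.0],
]
x_peak, y_peak = soft_argmax_2d(peaked, beta=50.0)
```

By symmetry, the first call returns a position at the center pixel, and the second returns a position very close to the peak at column 2, row 0. In an end-to-end setup, a coordinate loss can be applied directly to such outputs because every step is a smooth function of the heatmap scores.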