
Research On Key Technologies Of Face Recognition Based On Pose Normalization

Posted on: 2021-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X H Shao
GTID: 1488306305951949
Subject: Computer application technology
Abstract/Summary:
As one of the most intensively studied topics in computer vision, face recognition has made great progress in both academia and industry. However, the key modules of a face recognition system, such as face detection, facial landmark detection, and feature extraction, are still vulnerable to variations in illumination, pose, expression, and occlusion in unconstrained scenarios. Compared with the other factors, pose variation causes more pronounced discrepancies at both the texture and feature levels. Face recognition based on pose normalization frontalizes the original face images in advance, which explicitly reduces the impact of pose variation on the intra-class compactness and inter-class separability of the extracted features, and it is an active research focus in face recognition. However, this approach still faces the following problems:

1) Pose normalization is sensitive to the accuracy of facial landmark detection, which still has considerable room for improvement in complex scenarios.
2) The image reconstruction capability of the pose normalization model is limited by the number and scale of the training sets, and there are obvious distribution differences among datasets. It is difficult for a deep model with a single constraint to be trained in a unified way on multiple datasets.
3) The pose normalization and feature extraction modules are independent of each other, so errors introduced by the former are passed directly to the latter and degrade the discriminability of the final features.

These problems pose great challenges to existing methods of face recognition based on pose normalization. This dissertation optimizes the key technologies of face recognition based on pose normalization in two ways: improving the performance of each algorithm module and strengthening the relationship between modules. The main research achievements are as follows.

1) Since existing facial landmark detection methods suffer a performance drop on face images with extremely large poses, this dissertation proposes an end-to-end network based on Landmark Heatmaps and Affinity Fields (LHAF) for face localization and facial landmark detection. To represent the unstructured information that arises from pose and occlusion, LHAF uses global context to jointly learn the landmark heatmaps and the associations between any two parts, and thus detects basic landmarks directly on an image in a bottom-up manner. Furthermore, to handle the multi-modal uncertainty of facial features under different poses, a deep convolutional regression network with multiple branches is introduced, which first assigns faces to their respective feature spaces according to pose and then accurately predicts the full set of facial landmarks. LHAF is evaluated on three popular datasets (300-W, AFLW, and Menpo). Experimental results show that the proposed method is robust to pose and occlusion variation and improves the performance of facial landmark detection in complex scenarios.

2) Since most regression-based facial landmark detection algorithms are sensitive to variation in face initialization and to annotation inconsistency, this dissertation proposes another facial landmark detection method based on Deep Progressive Reinitialization and Error-driven Learning (DPREL). On the one hand, by introducing supervised spatial transformer networks into a progressive coarse-to-fine regression structure, DPREL progressively learns the best transformation parameters for state reinitialization from the whole input face image and from its parts, which simplifies inference in the subsequent regression subnetworks. On the other hand, this dissertation proposes an adaptive landmark-weighted loss function, which adjusts the importance of different landmarks according to their prediction errors during training, to avoid the overfitting caused by inconsistencies in manual landmark annotation. The performance of DPREL is evaluated on four popular datasets (300-W, AFLW, COFW, and WFLW) that include faces with various poses, illuminations, and expressions. The experimental results show that the proposed method not only enhances robustness to different kinds of unstable initialization, but also significantly reduces the adverse impact of annotation inconsistencies on model training. In the final accuracy evaluation, the DPREL models outperform other existing methods.

3) To make pose normalization more robust to the errors introduced by facial landmark detection, this dissertation presents a novel Flexible Pose Normalization (FPN) approach to face recognition. Given a face image with an arbitrary pose, FPN first maps it to a preset frontal face template in 2D space before the 3D geometric transformation, which simplifies the estimation of the intrinsic parameters in camera calibration. When matching the 2D facial landmarks to the 3D face model, the proposed method retains the landmarks that best match the non-deformable 3D model, using a flexible camera calibration method based on RANSAC and unique facial characteristics. The image quality of pose normalization is therefore insensitive to landmark outliers. The FPN method is evaluated on the popular LFW dataset under two protocols. The experimental results demonstrate that the proposed method effectively reduces the intra-class differences between faces of the same identity in different poses and improves multi-view face recognition in complex scenarios.

4) Since existing pose normalization methods are limited by single-domain training datasets and are independent of feature extraction, this dissertation proposes a multi-view face recognition method based on a deep Well-Advised Pose Normalization Network (WAPNN). Through multi-domain learning with an adaptively weighted loss objective, WAPNN bridges the distribution gap between datasets and implements an end-to-end facial pose normalization network driven directly by the recognition task. Meanwhile, by introducing linear feature fusion based on image quality perception, the proposed method enhances the descriptive and discriminative capacity of the extracted features. The performance of WAPNN is evaluated on four popular datasets (Multi-PIE, CFP, IJB-A, and LFW). The results show that the proposed method recovers frontal faces with good-quality textures and well-preserved identity, and significantly improves the performance of multi-view face recognition in both constrained and unconstrained scenarios.
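
The abstract does not spell out how FPN combines RANSAC with face-specific cues, so the following is only a generic sketch of the underlying idea: using RANSAC to keep the 2D landmarks that agree with a rigid 3D model and to discard outliers before estimating the camera. It assumes an affine camera model, which is a simplification chosen here and not stated in the abstract. The function names, the pixel threshold, and the iteration count are all hypothetical, and the data is synthetic.

```python
import numpy as np


def fit_affine_camera(pts3d, pts2d):
    """Least-squares 2x4 affine camera P such that pts2d ~= [pts3d, 1] @ P.T."""
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])      # (n, 4)
    solution, *_ = np.linalg.lstsq(homog, pts2d, rcond=None)  # (4, 2)
    return solution.T                                         # (2, 4)


def reprojection_error(P, pts3d, pts2d):
    """Per-landmark distance between projected model points and detections."""
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    return np.linalg.norm(homog @ P.T - pts2d, axis=1)


def ransac_affine_camera(pts3d, pts2d, n_iters=200, threshold=2.0, seed=0):
    """Return (P, inlier_mask); threshold and n_iters are illustrative values."""
    rng = np.random.default_rng(seed)
    n = len(pts3d)
    best = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        # Four correspondences are the minimal set for the 8 unknowns of P.
        sample = rng.choice(n, size=4, replace=False)
        P = fit_affine_camera(pts3d[sample], pts2d[sample])
        inliers = reprojection_error(P, pts3d, pts2d) < threshold
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("RANSAC found no usable consensus set")
    # Refit on the whole consensus set for the final estimate.
    return fit_affine_camera(pts3d[best], pts2d[best]), best


# Synthetic check: 30 model points, 5 of which get grossly wrong 2D detections.
rng = np.random.default_rng(1)
model = rng.normal(scale=50.0, size=(30, 3))
P_true = np.array([[1.2, 0.1, 0.0, 100.0],
                   [-0.1, 1.2, 0.2, 120.0]])
detected = np.hstack([model, np.ones((30, 1))]) @ P_true.T
bad = np.array([0, 7, 13, 21, 28])
detected[bad] += rng.uniform(30.0, 60.0, size=(5, 2))

P_est, kept = ransac_affine_camera(model, detected)
print("landmarks kept:", int(kept.sum()))
print("all corrupted landmarks rejected:", bool(not kept[bad].any()))
```

Because the final camera is refit only on the consensus set, a few badly detected landmarks do not distort the estimate, which is the property the abstract attributes to FPN. A full perspective camera or face-specific sampling rules would change the fitting step but not the overall loop.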
Keywords/Search Tags: Facial Landmark Detection, Spatial Transformer Networks, Error-driven Learning, Facial Pose Normalization, Multi-domain Learning, Image Quality Perception, Multi-view Face Recognition