
Research On The Technology Of Deep Learning Based Face Image Recognition

Posted on: 2020-12-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X F Liu
GTID: 1368330572971076
Subject: Mechanical and electrical engineering
Abstract/Summary:
With the development of artificial intelligence and computer vision, enabling machines to understand images has become an important research topic. Face image analysis is an active branch of this field because face images carry rich information and have broad application prospects. Face image recognition refers to automatically inferring a person's identity, expression, age, gender, etc. by analyzing a facial image, a video, or a collection of images and videos. A face image usually entangles many factors, including identity, expression, age, gender, illumination, and viewing angle. How to extract a feature representation relevant to the recognition task (for example, identity information for face recognition, or expression information for expression recognition) is currently a main research direction for deep learning based pattern recognition.

This dissertation further explores how to introduce prior knowledge of invariance to certain task-irrelevant attributes, so that the features extracted for the main recognition task are more discriminative and generalize better. For example, features for expression recognition should be robust to identity changes, while features for identity recognition should be robust to changes in expression, lighting, makeup, etc. Toward this goal, and based on two important face-related applications, facial expression recognition and face identity recognition, this dissertation proposes an adaptive deep metric learning method and a feature-level adversarial training method to disentangle expression from identity, or identity from other attributes. In addition, a deep reinforcement learning algorithm is introduced for set-based face recognition. The main contributions of this dissertation can be summarized as follows:

1. Facial expression is one of the most expressive ways for humans to convey their emotional state. Adaptive metric learning is introduced to remove identity information and extract a purer expression feature from facial images or videos. Chapter 2 proposes a novel deep metric learning algorithm, the (N+M)-tuple cluster loss, which not only alleviates the difficulty of threshold validation and anchor selection but also reduces the computational burden of deep metric learning. Its threshold parameters are learned adaptively. By choosing the negative samples to be images of the same person showing other expressions, it explicitly eliminates identity information, and hard sample mining can be performed efficiently. Experiments on the CK+, MMI and SFEW datasets show that the algorithm effectively improves recognition accuracy by exploiting the identity labels that usually exist in expression recognition datasets.

2. The neutral expression of the same person may be the most important reference sample, but it is missing from several datasets. Chapter 3 therefore proposes a hard negative generation scheme with radial metric learning. Identity information is removed by comparing the query image with a neutral-face reference image of the same identity that is generated from the query itself. The hard negative generation uses a pixel-level generative adversarial network to remove attributes such as expression and pose and to infer a normalized face in which the identity is kept unchanged. Experiments on the CK+, MMI and SFEW datasets show that hard negative generation can exploit frontal neutral face images from face recognition datasets, which are much larger than expression recognition datasets, to incorporate prior knowledge about the reference image. It not only improves recognition performance but also greatly reduces training time compared with traditional metric learning.

3. Chapter 4 systematically summarizes the relationships among the various factors in a face image and defines them in the more general setting of multi-label data. A disentanglement network based on feature-level adversarial training is proposed. The input image is decomposed into three features: a discriminative feature for the main recognition task (such as identification), the semantic attributes (such as expression and illumination) to which that feature is expected to be robust, and the unnamed or unquantifiable factors (such as background) that are expected to be removed. The three features are complementary to and independent of one another.

4. With the explosive growth of digital media content, using a collection of images and videos for verification/identification is better aligned with real biometric applications. For example, several query images or video frames of a person can be captured by multiple cameras, and the gallery can likewise be composed of various historical ID photos, images taken at other places, or images published on social media. Compared with comparing a single image or video, this setting provides more information, but it also makes information fusion challenging. Chapter 5 proposes an algorithm based on deep reinforcement learning to explore the importance of and complementarity among images. The effectiveness of the algorithm is verified on the IJB-A/B/C series of set-based face recognition datasets, the video-based Celebrity-1000 dataset, and a pedestrian re-identification task.
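The negative-selection idea in the first contribution can be illustrated with a minimal sketch. This is not the (N+M)-tuple cluster loss itself, whose formulation is not given in this abstract. It is a generic hinge loss around a distance threshold, used here only to show what it means to draw negatives from the same person's other expressions. The `Sample` record, the function names, the `margin` parameter, and the choice of positives as "any other image with the anchor's expression" are all assumptions made for illustration.

```python
import math
from collections import namedtuple

# Hypothetical record type: a feature vector plus identity and expression labels.
Sample = namedtuple("Sample", ["feature", "identity", "expression"])


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_tuple(anchor, pool):
    """Split a pool into positives and identity-matched negatives for one anchor.

    Positives (assumed here) share the anchor's expression. Negatives show the
    same person with a different expression, so identity cannot be what
    separates them from the anchor.
    """
    positives = [s for s in pool
                 if s is not anchor and s.expression == anchor.expression]
    negatives = [s for s in pool
                 if s.identity == anchor.identity
                 and s.expression != anchor.expression]
    return positives, negatives


def cluster_hinge_loss(anchor, positives, negatives, threshold, margin):
    """Generic hinge loss around a distance threshold (illustrative stand-in).

    Positives are penalized when farther than (threshold - margin); negatives
    are penalized when closer than (threshold + margin). The abstract says the
    threshold is learned adaptively; here it is simply passed in as a number.
    """
    loss = 0.0
    for p in positives:
        d = euclidean(anchor.feature, p.feature)
        loss += max(0.0, d - (threshold - margin))
    for n in negatives:
        d = euclidean(anchor.feature, n.feature)
        loss += max(0.0, (threshold + margin) - d)
    return loss
```

In a real training loop the features would come from a network and the threshold would be a trainable parameter updated together with the network weights; the pure-Python version above only makes the sampling rule and the two-sided penalty explicit.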
Keywords/Search Tags: Facial expression recognition, Face recognition, Metric learning, Reinforcement learning, Adversarial generation network
Related items