Real-world Facial Expression Analysis Based On Deep Learning

Posted on: 2022-09-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Li
GTID: 1488306326479354
Subject: Information and Communication Engineering
Abstract/Summary:
Facial expression recognition aims to identify human emotions from facial characteristics. Understanding different categories of facial expressions plays an important role in designing computer-based systems that automatically analyze people's cognitive-affective states and respond accordingly. Automatic facial expression recognition has become an active research topic in affective computing and computer vision because of its potential applications in social media analysis and human-computer interaction. Classical prototype expression theory holds that humans across regions and races share seven consistent basic expressions. Early research on facial expression recognition was largely confined to this theory and classified images into the seven basic expressions under laboratory-controlled conditions within a single database.

However, with the popularity of the Internet and the development of deep learning, the focus has gradually shifted from laboratory-controlled databases to more challenging real-world scenarios. Facial expressions in the real world span a variety of scenes and dynamic changes. Environmental factors, such as illumination, head pose and occlusion, and cultural factors, such as ethnic regions and social trends, are coupled with expressions nonlinearly, which poses great challenges for expression recognition. A system that achieves nearly perfect performance under ideal conditions may perform poorly under more challenging ones. Moreover, prototype expression theory fails to capture the complex and subtle emotions that people express in daily life. The field still lacks large-scale facial expression datasets with accurate annotations and corresponding algorithms for recognizing real-world expressions.

To address these problems, this dissertation first extends basic expressions to compound and blended expressions and constructs corresponding large-scale real-world facial expression datasets. An expression manifold analysis theory for complex emotions is then established, and a new method for cross-domain expression transfer learning is proposed, forming a line of work that ranges from basic data definition through accurate recognition to cross-scene adaptation. The main content and contributions of this dissertation are as follows:

(1) To cope with the strong subjectivity of expression annotations, this research employs crowdsourcing to annotate a massive set of images collected from the Internet, so that each image is independently annotated a sufficient number of times. An expectation-maximization algorithm is then developed to evaluate the reliability of each annotator and filter out noisy labels. Compound and blended expressions are mined from the resulting label distributions. Finally, two databases are constructed: a novel real-world facial expression database, RAF-DB, which contains seven basic expressions and twelve compound expressions, and a novel multi-label facial expression database, RAF-ML, which contains multiple blended expressions. Together they broaden the definition of classic prototype expressions and provide a data basis for recognizing real-world facial expressions.

(2) To address expression-unrelated interference factors in the wild, we propose a new deep locality-preserving convolutional neural network (DLP-CNN) that aims to enhance the discriminative power of deep features by preserving locality closeness while maximizing inter-class scatter. Specifically, we adapt the seminal idea of local neighbors from shallow learning to deep feature learning by creating a locality preserving loss that pulls locally neighboring faces of the same class together. Jointly trained with the classical softmax loss, which forces different classes to stay apart, the locality preserving loss drives the intra-class local clusters of each class to become compact. Experiments on RAF-DB and other databases show that the proposed DLP-CNN outperforms state-of-the-art handcrafted features and deep learning-based methods for expression recognition in the wild.

(3) Focusing on the ambiguity and continuity of blended expressions, we propose a new deep manifold learning network, called Deep Bi-Manifold CNN (DBM-CNN), to learn discriminative features for multi-label expressions by jointly preserving the local affinity of deep features and the manifold structures of emotion labels. DBM-CNN simultaneously and efficiently considers crowdsourced label information and feature compactness in the low-dimensional manifolds by adding a new loss layer, the bi-manifold loss. Jointly trained with the cross-entropy loss, which forces images with different labels to stay apart, the bi-manifold loss drives locally neighboring faces that share a similar intensity distribution to become coherent. Extensive experiments on RAF-ML and other diverse databases show that the deep manifold feature is not only superior in multi-label expression recognition in the wild, but also captures elemental and generic components that are effective for a wide range of expression recognition tasks.

(4) Because of construction bias and annotators' emotion perception bias, different facial expression databases form different expression recognition scenarios. To further improve cross-domain expression recognition, we propose a novel deep Emotion-Conditional Adaption Network (ECAN) based on transfer learning. The proposed network matches not only the marginal distribution but also the class-conditional distribution across domains by exploring the underlying label information of the target dataset. Moreover, the largely ignored expression class distribution bias across domains is addressed by introducing a learnable class-wise weighting parameter, so that the training and testing domains share a similar class distribution. Extensive cross-database experiments on both lab-controlled and real-world datasets demonstrate that the proposed ECAN generalizes well across domains, yielding competitive performance on various cross-dataset facial expression recognition tasks and outperforming state-of-the-art methods.

In summary, this dissertation carries out real-world facial expression analysis on a series of problems, including reliable image annotation, complex expression recognition and cross-domain emotion adaptation. First, the categories of real-world expressions are broadened. Then, based on these data, the ideas of manifold learning and transfer learning are integrated into the deep learning framework to handle the complex and subtle characteristics of facial expressions. Experimental results on various datasets show that the proposed methods effectively improve the discriminative ability and generalization ability of real-world facial expression recognition.
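
As an illustration of the locality preserving loss described in contribution (2), the following is a minimal sketch and not the dissertation's implementation. The abstract does not state the formula, so the sketch assumes one plausible form: for each sample, half the squared Euclidean distance between its feature vector and the mean of its k nearest neighbors from the same class, summed over the batch. The function names, the parameter `k` and the toy data are hypothetical, and the joint training with softmax loss is omitted.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def locality_preserving_loss(features, labels, k=2):
    """Assumed form: 0.5 * sum_i ||x_i - mean(k nearest same-class neighbors of x_i)||^2.

    features: list of feature vectors (lists of floats), one per sample
    labels:   list of class labels, one per sample
    k:        number of same-class neighbors to average (illustrative default)
    """
    total = 0.0
    for i, (x_i, y_i) in enumerate(zip(features, labels)):
        # Candidate neighbors: every other sample that shares the label of x_i.
        same_class = [
            features[j]
            for j, y_j in enumerate(labels)
            if j != i and y_j == y_i
        ]
        if not same_class:
            continue  # a class with a single sample contributes nothing
        # Keep the k closest same-class samples (fewer if the class is small).
        same_class.sort(key=lambda x: squared_distance(x_i, x))
        neighbors = same_class[:k]
        # Mean of the selected neighbors, computed per feature dimension.
        center = [sum(column) / len(neighbors) for column in zip(*neighbors)]
        total += 0.5 * squared_distance(x_i, center)
    return total


if __name__ == "__main__":
    # Toy batch: three samples of class 0 and one isolated sample of class 1.
    toy_features = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
    toy_labels = [0, 0, 0, 1]
    print(locality_preserving_loss(toy_features, toy_labels, k=2))
```

Under this assumed form, the loss shrinks as same-class samples move toward their local neighbors and is zero when each sample coincides with the mean of its neighbors, which matches the stated goal of making intra-class local clusters compact.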
Keywords/Search Tags: facial expression recognition, real world, facial expression database, deep learning, manifold learning, transfer learning