
Research On Facial Expression Recognition Method Based On Deep Neural Network

Posted on: 2021-04-17  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Zhang  Full Text: PDF
GTID: 2518306047499484  Subject: Control Science and Engineering
Abstract/Summary:
Machine learning and deep neural networks are developing rapidly, and high-performance computing devices are widely available. Face-recognition technology has consequently experienced unprecedented development, and the market for its specific applications is in a stage of rapid growth. In recent years, facial expression recognition has become a hot topic in both academic and industrial research, attracting extensive attention in human-computer interaction, medical intervention, national security, information communication, and autonomous driving. Although deep networks are increasingly popular and applicable, facial expression recognition still faces many challenges at this stage: existing expression datasets are too small; the form of the data is too uniform; expression intensity varies widely under natural conditions, so weak expressions are recognized poorly even by deep networks that otherwise generalize well; and single-modal recognition cannot fully capture genuine human emotion. To address these issues, this thesis explores and adjusts the overall expression recognition pipeline in three aspects: data, recognition framework, and fusion strategy.

First, a new multimodal expression dataset collected under natural conditions, HEU-Emotion, is created. It is designed to remedy shortcomings of publicly available datasets, such as narrow collection conditions, uneven distribution of subjects, and overly uniform expression intensity. Videos are crawled from the Internet and manually edited into clips to enlarge the sample size; the raw data are then detected, cropped, and identified, and the expressions are manually labeled by crowdsourcing. Compared with commonly used standard expression datasets, recognition accuracy on HEU-Emotion is lower under the same recognition network, indicating that the dataset is more challenging. Baseline experiments on its unique multimodal properties further verify that it is more practical than existing datasets.

Second, building on these more complex expression samples, an expression recognition model suited to dynamic sequences is constructed on a feature-fusion strategy. The loss function is redefined by combining AM-Softmax loss, an effective loss from the face recognition field, with Island loss from the expression recognition field; the combined loss more effectively reduces intra-class variance and enlarges inter-class distance. The creation process of the HEU-Emotion dataset shows that expression information under natural conditions is unevenly distributed, with weak expressions significantly outnumbering strong ones. Based on the concept of strong and weak expression inputs, the existing PPDN network structure is therefore improved, and the network in the recognition stage is replaced with the best-performing model from the preceding experiments. To further improve recognition, an attention mechanism is applied: SE and CBAM modules are connected in series across the channel and spatial domains.

Finally, since the above improvements are explored and verified within a single modality, the thesis concludes with multimodal model fusion. To verify that the three modalities of pose, voice, and face jointly express emotion, decision-level model fusion is used to show that multimodal information is complementary and significantly outperforms any single modality. After training on the high-complexity HEU-Emotion dataset, the final experimental results show that the expression recognition network fused from multimodal models is more robust and more accurate for dynamic expression recognition under natural conditions, promoting the practical application of expression recognition.
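The abstract does not specify the exact decision-level fusion rule. A minimal sketch, assuming the fused prediction is a weighted average of each modality's softmax probabilities (the function and variable names here are illustrative, not from the thesis):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def decision_level_fusion(modality_logits, weights=None):
    """Fuse per-modality classifier outputs at the decision level.

    modality_logits: list of arrays, each (batch, num_classes),
                     e.g. logits from the face, voice, and pose branches.
    weights: optional per-modality weights; defaults to a uniform average.
    Returns (fused_probs, predicted_class_indices).
    """
    probs = np.stack([softmax(l) for l in modality_logits])  # (M, batch, C)
    if weights is None:
        weights = np.full(len(modality_logits), 1.0 / len(modality_logits))
    fused = np.tensordot(weights, probs, axes=1)             # (batch, C)
    return fused, fused.argmax(axis=-1)

# Illustrative example: three modalities, three expression classes.
# The face branch favors class 0, but voice evidence for class 1 dominates
# after fusion -- the complementary-information effect the thesis verifies.
face  = np.array([[2.0, 1.0, 0.1]])
voice = np.array([[0.5, 2.5, 0.2]])
pose  = np.array([[1.0, 0.9, 0.8]])
fused, pred = decision_level_fusion([face, voice, pose])
```

With uniform weights this is simple score averaging; the per-modality weights could instead be set from validation accuracy of each single-modality branch.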
Keywords/Search Tags: Deep Learning, Feature Fusion, Expression Recognition, Attention Mechanism