
Research On Multi-Feature Fusion Expression Recognition Method Based On Head Information

Posted on: 2021-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Yin
Full Text: PDF
GTID: 2518306047992359
Subject: Master of Engineering
Abstract/Summary:
As artificial intelligence moves from the level of technical realization to that of application, we hope that AI can understand human emotion and respond to it. Affective computing judges human emotion from pictures, sound, text, and other channels, so that a machine can understand the user's intention and make an appropriate response, or simulate human emotion in a given application environment. This will advance visual understanding and human-computer interaction at the application level, and will also raise the level of the basic algorithms of artificial intelligence.

The first step of affective computing is emotion recognition. Research on emotion recognition is mainly divided by modality into text, voice, facial expression, body posture, physiological signals, and so on. Among these, facial expression plays a particularly important role in daily communication: it is an intuitive reflection of a person's inner state, and it transmits rich information such as the speaker's emotion, psychological feeling, attitude, and intention. Facial expression is therefore an especially direct and rewarding basis for emotion recognition research.

Facial expression recognition has long been a hot topic in computer vision, and most current work studies it in complex, unconstrained environments. Although deep learning methods have achieved good results on this problem, few breakthroughs have been made. The main obstacles are that spontaneous facial expression data in natural environments are difficult to collect, that data-set annotation is subjective, and that the network models used for expression recognition are too uniform. Facing these challenges, this thesis works on three aspects: the collection of facial expression recognition data, the improvement of expression recognition network models, and multi-feature and multi-modal fusion.

First, this thesis reviews the commonly used expression recognition data sets and weighs the advantages and disadvantages of each. To address those disadvantages, it establishes HEU-Emotion, a new multi-modal expression recognition data set collected in the natural environment, and describes how the data set was built, including expression data collection, processing, and annotation; the data distribution and related attributes of the data set are also presented in detail. To date, HEU-Emotion is the multi-modal expression recognition data set in the natural environment with the largest amount of data.

Second, aiming at the problem that features cannot at present be learned from facial key points in Euclidean space, this thesis transfers the facial expression recognition task to non-Euclidean space. The facial key points are first extracted and used as input to construct a graph structure; a graph convolutional neural network then performs graph convolution on this structure, and the extracted graph features are classified. This graph-based model provides a new approach to facial expression recognition from key-point features (a minimal sketch of the idea follows).
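The abstract does not specify the graph construction or the network depth, so the following is only a minimal sketch of a key-point GCN, assuming 68 dlib-style landmarks with (x, y) coordinates as node features and two graph convolutions of the standard Kipf-Welling form; the adjacency structure, layer sizes, and class count are all illustrative assumptions, not the thesis's actual architecture.

```python
import torch
import torch.nn as nn


def normalize_adjacency(adj):
    """A_hat = D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization used by GCNs."""
    adj = adj + torch.eye(adj.size(0))
    d_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)


class GraphConvLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, a_hat):
        # x: (batch, nodes, in_dim); a_hat: (nodes, nodes) normalized adjacency
        return torch.relu(a_hat @ self.linear(x))


class LandmarkGCN(nn.Module):
    """Classify expressions from a graph built over facial key points."""
    def __init__(self, num_classes=7):
        super().__init__()
        self.gc1 = GraphConvLayer(2, 32)   # each node starts as an (x, y) coordinate
        self.gc2 = GraphConvLayer(32, 64)
        self.head = nn.Linear(64, num_classes)

    def forward(self, landmarks, a_hat):
        h = self.gc2(self.gc1(landmarks, a_hat), a_hat)
        return self.head(h.mean(dim=1))    # average node features, then classify


# usage: edges would connect related landmarks (e.g. along facial contours)
adj = torch.zeros(68, 68)                  # hypothetical empty adjacency for illustration
a_hat = normalize_adjacency(adj)
logits = LandmarkGCN()(torch.randn(4, 68, 2), a_hat)  # batch of 4 faces
```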
To address the unbalanced recognition results caused by unbalanced expression data sets, the thesis then builds dual-branch and multi-branch convolutional neural networks, in which separate branches predict the expressions that have less data and are harder to train, and the loss function is adapted to the characteristics of the data. The results make clear that the expression recognition model based on the multi-branch convolutional network outperforms the single-branch network (a schematic sketch of the branch-plus-weighted-loss idea is given below).

Finally, this thesis performs multi-feature fusion, combining the facial key-point features extracted by the graph convolutional network with the video sequence features extracted by the multi-branch convolutional network. In addition, multi-modal fusion of voice features and facial features lets the two complement each other and express richer information. Human emotional expression is itself a multi-modal process: we show our inner emotions through changes of facial expression, tone of voice, and posture, so integrating these modalities matches how emotion is actually expressed in daily life. The final results show that the multi-modal fusion method is better than single-modal recognition, and multi-modal fusion will be a main research direction of facial expression recognition in the future.
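The abstract describes the dual-branch idea only at a high level. The sketch below shows one plausible reading: a shared backbone with separate classifier heads for well-represented and under-represented expressions, trained with a class-weighted cross-entropy loss. The class split, layer sizes, and weight values are assumptions, and the backbone here takes single images rather than video sequences for brevity.

```python
import torch
import torch.nn as nn


class DualBranchNet(nn.Module):
    """Shared CNN backbone with separate heads for common and rare expressions."""
    def __init__(self, num_common=4, num_rare=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.common_head = nn.Linear(64, num_common)  # well-represented classes
        self.rare_head = nn.Linear(64, num_rare)      # under-represented classes

    def forward(self, images):
        feat = self.backbone(images)
        # concatenate branch logits into one prediction over all 7 classes;
        # labels must be indexed with common classes first, then rare ones
        return torch.cat([self.common_head(feat), self.rare_head(feat)], dim=1)


# class-weighted loss: rarer classes get larger weights (values are illustrative)
class_weights = torch.tensor([1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)
```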
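The abstract states that key-point graph features, video-sequence features, and voice features are fused, but not how. A common scheme consistent with that description is feature-level fusion: concatenate the per-modality feature vectors and classify them jointly. All feature dimensions below are hypothetical, chosen only to make the sketch self-contained.

```python
import torch
import torch.nn as nn


class LateFusionClassifier(nn.Module):
    """Fuse per-modality feature vectors by concatenation before classification."""
    def __init__(self, graph_dim=64, video_dim=64, audio_dim=40, num_classes=7):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(graph_dim + video_dim + audio_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, graph_feat, video_feat, audio_feat):
        # feature-level fusion: concatenate modalities, then jointly classify
        return self.fuse(torch.cat([graph_feat, video_feat, audio_feat], dim=1))


# usage with hypothetical feature extractor outputs for a batch of 4 samples
fusion = LateFusionClassifier()
logits = fusion(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 40))
```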
Keywords/Search Tags: Graph Convolution Network, Deep Learning, Feature Fusion, Expression Recognition