As the most direct carrier of human emotion, the facial expression is one of the key research objects in the field of emotion analysis. Accurate facial expression recognition is therefore of great significance for improving the emotion-perception capability of human-computer interaction systems. Aiming at the shortcomings of existing models, this thesis explores facial expression recognition methods for both images and videos. The specific contents are as follows.

1. For image-oriented facial expression recognition, a convolutional neural network based on an attention mechanism and multi-path consistent fusion is studied. Firstly, a convolutional neural network with an attention mechanism is constructed to extract convolutional features from the facial emotional regions around the eyebrows, eyes, nose, and mouth. On this basis, the attention mechanism automatically weights the regional convolutional features according to their importance to classification, increasing the contribution of the most informative regions. Secondly, the regional features are linearly integrated by a multi-path consistent fusion module, and the designed loss function is used to ensure consistency between regions and eliminate the effect of confusing regions. Finally, the recognition accuracies on the CK+ and RAF-DB databases reach 97.44% and 85.43%, respectively. The experimental results demonstrate the superiority of the proposed method.

2. For video-oriented facial expression recognition, a deep spatiotemporal network based on multi-modal fusion is studied. Firstly, each video is divided into multiple segments in the data preprocessing stage; to reduce the interference of redundant information and improve the perception of facial expressions, optical flow and facial landmark trajectories are used to describe facial movement. Secondly, a residual neural network based on (2+1)D convolutions and an attention-based convolutional neural network are constructed to extract spatiotemporal features. Then, a multi-modal fusion module is introduced to synthesize the emotional information and produce the final classification. Finally, the recognition accuracies on the CK+ and MMI databases reach 98.80% and 79.33%, respectively. The experimental results demonstrate the superiority of the proposed method.
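To make the first contribution concrete, the following is a minimal PyTorch sketch of attention-weighted regional feature fusion: a shared backbone extracts a feature from each facial region crop, an attention head scores each region's importance, and the weighted features are fused for classification. The region set (eyebrows, eyes, nose, mouth), backbone depth, feature dimension, and seven-class output are illustrative assumptions rather than the exact architecture of the thesis, and the multi-path consistency loss is omitted.

```python
import torch
import torch.nn as nn

class RegionAttentionFER(nn.Module):
    """Sketch: attention-weighted fusion of regional convolutional features."""
    def __init__(self, feat_dim=256, num_classes=7):
        super().__init__()
        # Shared convolutional backbone applied to each facial region crop
        # (eyebrows, eyes, nose, mouth).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Attention head: one importance score per region.
        self.attn = nn.Linear(feat_dim, 1)
        # Classifier over the fused (attention-weighted) representation.
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, region_crops):
        # region_crops: (batch, num_regions, 3, H, W)
        b, r = region_crops.shape[:2]
        feats = self.backbone(region_crops.flatten(0, 1)).view(b, r, -1)
        weights = torch.softmax(self.attn(feats), dim=1)  # (b, r, 1)
        fused = (weights * feats).sum(dim=1)              # weighted regional fusion
        return self.classifier(fused), weights

# Dummy usage: 2 images, 4 region crops of size 64x64, 7 expression classes.
logits, attn_weights = RegionAttentionFER()(torch.randn(2, 4, 3, 64, 64))
```

For the second contribution, the (2+1)D convolution factorizes a 3D spatiotemporal convolution into a 2D spatial convolution followed by a 1D temporal convolution, which is the building block of the residual network mentioned above. The sketch below shows one such block applied to a clip tensor of shape (batch, channels, frames, height, width); the channel sizes and intermediate width are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Conv2Plus1D(nn.Module):
    """Sketch: a (2+1)D block = 2D spatial conv followed by 1D temporal conv."""
    def __init__(self, in_ch, out_ch, mid_ch=64):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, mid_ch, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.temporal = nn.Conv3d(mid_ch, out_ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))

    def forward(self, x):  # x: (batch, channels, frames, H, W)
        return self.temporal(F.relu(self.spatial(x)))

# Dummy usage: a 16-frame RGB clip at 112x112 resolution.
out = Conv2Plus1D(3, 64)(torch.randn(2, 3, 16, 112, 112))  # -> (2, 64, 16, 112, 112)
```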