Facial expression recognition infers a person's psychological state by classifying the expression shown in a static image or a dynamic video sequence, enabling computers to recognize and understand human expressions. This fundamentally changes the relationship between humans and computers and allows computers to provide more advanced services. The hand-crafted features used in traditional methods usually cannot model variations in illumination, pose, and other factors: they may perform well on small laboratory datasets, but their performance in real-world scenes is poor. Deep neural networks, by contrast, excel at feature extraction and high-dimensional data processing, so facial expression recognition is now usually based on neural networks. The main contributions of this paper are as follows:

(1) According to the Facial Action Coding System (FACS), different regions of the face play different roles in recognizing expressions. A network framework based on an attention mechanism is therefore proposed, composed of a modified VGG19 and CBAM. CBAM consists of a channel attention module and a spatial attention module in series, which adaptively adjust the feature map first along the channel dimension and then along the spatial dimension: channels and spatial regions that play a key role in expression recognition receive higher weights, while unimportant channels and regions receive lower weights. Experiments indicate that adding CBAM significantly improves the accuracy of the base network.

(2) To address the low recognition accuracy caused by skewed face poses in real environments, length features and angle features related to facial expression can be obtained by locating and measuring facial landmarks as the expression changes; both kinds of features are invariant to rotation. Therefore, a network framework based on the length feature of landmarks and a network framework based on the angle
feature of landmarks are proposed, respectively. First, 68 landmarks are located on the face image. Then, following the way the face changes under different expressions, the Euclidean distances between these landmarks are computed to obtain a 2278-dimensional length feature, and the shape information between specific landmarks is computed to obtain the angle feature. Finally, the length and angle features are normalized and fed into the corresponding fully connected layers. The experimental results show that both network models are robust, with the framework based on the length feature of landmarks performing better.

(3) To account for both the different roles of different facial regions and the low recognition accuracy caused by skewed face poses, the attention-based framework and the landmark length-feature framework are combined into a network framework based on the attention mechanism and the length feature of facial landmarks, which is an end-to-end structure. The whole network is trained by joint fine-tuning. The final results show that the jointly fine-tuned two-branch network outperforms either single-branch network alone. The proposed model is evaluated on JAFFE, CK+, and FERPlus, reaching highest accuracies of 93.48%, 96.12%, and 84.73%, respectively, which indicates that the proposed algorithm is effective and robust.
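The CBAM refinement described in contribution (1) can be illustrated with a minimal NumPy sketch. This is not the trained model: the shared-MLP weights are random stand-ins for learned parameters, and the learned 7x7 convolution of the original spatial attention module is simplified to a plain sum of the pooled maps, so only the channel-then-spatial reweighting structure is shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Channel attention: global average- and max-pooled descriptors pass
    through a shared two-layer MLP; the summed outputs are squashed into
    per-channel weights in (0, 1)."""
    avg = feat.mean(axis=(1, 2))                    # (C,)
    mx = feat.max(axis=(1, 2))                      # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)    # shared MLP, ReLU hidden
    weights = sigmoid(mlp(avg) + mlp(mx))           # (C,)
    return feat * weights[:, None, None]

def spatial_attention(feat):
    """Spatial attention (simplified): channel-wise average and max maps
    are combined into a per-pixel weight in (0, 1). The real CBAM applies
    a learned 7x7 convolution to the two maps instead of summing them."""
    avg = feat.mean(axis=0)                         # (H, W)
    mx = feat.max(axis=0)                           # (H, W)
    weights = sigmoid(avg + mx)                     # (H, W)
    return feat * weights[None, :, :]

rng = np.random.default_rng(0)
C, H, W, hidden = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))               # stand-in feature map
w1 = rng.standard_normal((hidden, C))               # channel-reduction layer
w2 = rng.standard_normal((C, hidden))               # channel-expansion layer

# Channel attention first, then spatial attention, as in CBAM.
refined = spatial_attention(channel_attention(feat, w1, w2))
print(refined.shape)  # same shape as the input feature map
```

Because both attention maps lie in (0, 1), the refined map has the same shape as the input and every activation is attenuated rather than amplified; the network learns which channels and regions to attenuate least.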
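The landmark features of contribution (2) can likewise be sketched. The snippet below (an illustration, with random points standing in for detected landmarks and function names chosen here, not taken from the paper) computes the 68 * 67 / 2 = 2278 pairwise Euclidean distances that form the length feature, plus one example angle feature; both are unchanged by rotating the face.

```python
import numpy as np

def length_features(landmarks):
    """All pairwise Euclidean distances between landmarks, normalized by
    the largest distance. For 68 landmarks this yields 2278 values."""
    i, j = np.triu_indices(landmarks.shape[0], k=1)
    d = np.linalg.norm(landmarks[i] - landmarks[j], axis=1)
    return d / d.max()

def angle_feature(a, v, b):
    """Angle (radians) at vertex v formed by points a and b, from the
    dot product of the two edge vectors."""
    u, w = a - v, b - v
    cos = np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w))
    return np.arccos(np.clip(cos, -1.0, 1.0))

rng = np.random.default_rng(1)
pts = rng.uniform(0, 224, size=(68, 2))  # stand-in for detected landmarks
lvec = length_features(pts)
print(lvec.shape)  # (2278,)
```

Rotating the landmark set leaves every pairwise distance and every vertex angle unchanged, which is why these features tolerate the skewed face poses that degrade appearance-based recognition.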