Research On Facial Expression Recognition Method Based On Deep Learning

Posted on: 2021-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2428330647967252
Subject: Mechanical and electrical engineering

Abstract/Summary:
With the rapid development of computer vision, face recognition and facial expression recognition are increasingly used in intelligent human-computer interaction scenarios such as driver fatigue detection, smart medical treatment, and remote online courses. Despite the continuous progress of facial expression recognition in recent years, several difficult problems in this field remain unresolved. First, because of factors such as gender, age, and individual differences in how people express the same emotion, expressions of the same type vary greatly from subject to subject. In addition, factors such as the deflection pose of the face in the picture, occlusion of the face, and differences in the image acquisition environment all make the expression recognition task challenging. Solving or alleviating these problems is therefore the key to building a more efficient and accurate facial expression recognition system.

The deep learning-based expression recognition system constructed in this thesis consists of two parts: a face detection and alignment module and an expression recognition network. The face detection and alignment module detects faces and facial feature points with the MTCNN network. Face alignment is then achieved by an affine transformation based on the detected facial feature points. Finally, histogram equalization is applied to the cropped face image to enhance it. After this processing, the cropped face image is fed to the expression recognition network, which extracts features and classifies the expression.

The main work and innovations of this thesis are as follows:

(1) Introduction of a channel attention mechanism
The commonly used CK+ dataset is first used to evaluate the model. Because the CK+ dataset contains only a small number of samples, the relatively shallow VGG11 network is taken as the starting point, and part of its structure is modified to obtain the backbone of the expression recognition network, base1_net. Then, because the feature maps extracted by traditional convolutional neural networks cannot represent the importance of each feature channel, a channel attention mechanism is introduced on top of base1_net to address this problem.

(2) Joint supervision training with multiple loss functions
The traditional softmax loss function cannot measure the intra-class distance of samples in the same category. To address this, the center loss function is introduced, and the expression recognition network is trained under the joint supervision of the softmax loss and the center loss. This reduces the intra-class distance and increases the inter-class distance, thereby improving the robustness of the model. The network obtained from base1_net after introducing the channel attention mechanism and the center loss is named v1_net.

(3) Introduction of a spatial attention mechanism
To improve recognition accuracy on the more challenging FER2013 dataset, the expression recognition network v1_net is improved further. First, because the FER2013 dataset is significantly larger than the CK+ dataset, the backbone base1_net in v1_net is deepened to obtain v2_net. Then, because face detection and alignment fail for some FER2013 samples owing to excessive face deflection angles or occluded faces, a spatial attention mechanism is introduced into v2_net to alleviate such problems.

(4) Optimization of the training method
The stochastic weight averaging (SWA) training strategy is used to improve on the traditional SGD training method. By accumulating and averaging the solutions found in the later stages of training the expression recognition network, the trained model gains stronger generalization ability, which further improves the recognition accuracy of the network on the test set.
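
The histogram equalization step applied to the cropped face image can be illustrated with a minimal pure-Python sketch. It assumes an 8-bit grayscale image stored as a list of rows and uses the standard cumulative-distribution mapping; the function name `equalize_histogram` and the sample image are hypothetical, and the abstract does not say which implementation the thesis actually used.

```python
def equalize_histogram(image, levels=256):
    """Equalize a grayscale image given as a list of rows of ints in [0, levels)."""
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Histogram of intensity values.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function, kept as raw counts.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, so the darkest level present maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Lookup table that spreads the occupied levels over the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [round((c - cdf_min) * scale) if c > 0 else 0 for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast patch: all values lie between 100 and 103.
low_contrast = [
    [100, 100, 101, 101],
    [101, 102, 102, 103],
]
equalized = equalize_histogram(low_contrast)
```

After equalization the darkest level present maps to 0 and the brightest to 255, which is the contrast enhancement the preprocessing stage relies on.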
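
The idea behind the channel attention mechanism in item (1), weighting each feature channel by a learned importance score, can be sketched as follows. The abstract does not name the specific attention module, so this sketch assumes a squeeze-and-excitation style design (global average pooling, two small dense layers, sigmoid gate). All names and weight values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]


def channel_attention(feature_maps, w1, b1, w2, b2):
    """Rescale each channel of a C x H x W feature map by a gate in (0, 1).

    Assumed SE-style layout, not confirmed by the source.
    """
    # Squeeze: one global average per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Excitation: bottleneck dense layer with ReLU, then dense layer with sigmoid.
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[gate * v for v in row] for row in ch]
              for ch, gate in zip(feature_maps, gates)]
    return scaled, gates


# Two hypothetical 2x2 channels and hand-picked illustrative weights.
features = [
    [[3.0, 3.0], [3.0, 3.0]],  # channel 0
    [[1.0, 1.0], [1.0, 1.0]],  # channel 1
]
w1, b1 = [[1.0, -1.0]], [0.0]             # 2 channels -> 1 hidden unit
w2, b2 = [[2.0], [-2.0]], [0.0, 0.0]      # 1 hidden unit -> 2 gates
scaled, gates = channel_attention(features, w1, b1, w2, b2)
```

With these illustrative weights, channel 0 receives a gate above 0.5 and channel 1 a gate below 0.5, so the first channel is emphasized and the second suppressed. In a trained network the weights would be learned together with the rest of the model.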
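
The joint supervision in item (2) can be made concrete with a small sketch that combines the softmax cross-entropy loss with the center loss. It assumes the common form in which the center loss is half the squared distance between a sample's feature vector and its class center, averaged over the batch and weighted by a coefficient `lam`. The value of `lam`, the batch averaging, and the sample data are assumptions, and the update rule for the class centers is omitted.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(batch_logits, batch_features, labels, centers, lam=0.01):
    """Return (total, cross_entropy, center) losses averaged over the batch.

    `lam` balances the two terms; 0.01 is a hypothetical value.
    """
    n = len(labels)
    ce = sum(softmax_cross_entropy(z, y)
             for z, y in zip(batch_logits, labels)) / n
    cl = sum(center_loss(x, centers[y])
             for x, y in zip(batch_features, labels)) / n
    return ce + lam * cl, ce, cl


# Hypothetical two-sample batch with two classes and 2-D features.
logits = [[2.0, 0.0], [0.0, 2.0]]
features = [[1.0, 0.0], [0.0, 3.0]]
labels = [0, 1]
centers = {0: [1.0, 0.0], 1: [0.0, 1.0]}
total, ce, cl = joint_loss(logits, features, labels, centers)
```

The cross-entropy term pushes classes apart, while the center term pulls each feature toward its class center. In this example the second sample lies far from its center, so it alone contributes to the center loss.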
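
The weight averaging behind the SWA strategy in item (4) can be sketched as a running mean over the weights visited in the later epochs of training. This sketch treats a model as a flat list of floats and assumes one checkpoint per epoch; the names `update_swa`, `swa_from_checkpoints`, and `swa_start` are hypothetical. A full SWA setup would also involve a learning-rate schedule and recomputing batch-normalization statistics, which are not shown here.

```python
def update_swa(swa_weights, new_weights, n_averaged):
    """Fold one more weight vector into a running average of n_averaged vectors."""
    return [(s * n_averaged + w) / (n_averaged + 1)
            for s, w in zip(swa_weights, new_weights)]


def swa_from_checkpoints(checkpoints, swa_start):
    """Average the weights of all checkpoints from epoch `swa_start` onward."""
    swa = None
    n_averaged = 0
    for epoch, weights in enumerate(checkpoints):
        if epoch < swa_start:
            continue  # early epochs are excluded from the average
        if swa is None:
            swa = list(weights)
            n_averaged = 1
        else:
            swa = update_swa(swa, weights, n_averaged)
            n_averaged += 1
    return swa


# Hypothetical per-epoch weights produced by SGD (two parameters).
checkpoints = [
    [0.0, 10.0],
    [1.0, 4.0],
    [2.0, 2.0],
    [3.0, 0.0],
]
swa_weights = swa_from_checkpoints(checkpoints, swa_start=1)
```

Here the first checkpoint is skipped and the remaining three are averaged, so the result is the element-wise mean of the last three weight vectors.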
Keywords/Search Tags: convolutional neural network, expression recognition, channel attention mechanism, center loss, spatial attention mechanism, stochastic weight averaging