
Facial Expression Recognition Based On The Convolutional Neural Network And Attention Mechanism

Posted on: 2024-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Zhou
GTID: 2568307124460304
Subject: Electronic information
Abstract/Summary:
Facial expression recognition (FER) aims to detect crucial hidden information conveyed unintentionally through facial expressions during interpersonal communication. Its advancement improves practicality and service quality in fields such as human-machine interaction, augmented reality, fatigue detection, health care, criminal investigation, and virtual reality. Currently, while deep learning has improved the feature extraction ability of models, it has also reduced their universality. At the same time, the complexity of facial expressions, high environmental interference, uneven distribution of data samples, and a lack of high-quality datasets make it hard for algorithms to capture the essence of expressions, which limits their generalization ability in practice. Motivated by these research challenges and practical demands, this thesis studies facial expression recognition based on the convolutional neural network and attention mechanism. The primary contributions are as follows.

(1) To address the common issue of limited computational resources on terminal devices in real-world scenarios, we propose a lightweight neighbor convolutional neural network that balances the universality and recognition accuracy of FER models through multi-scale information exchange. Firstly, the network adopts multi-scale convolutional receptive fields to receive input data and perform feature extraction and fusion. Secondly, the neighbor blocks connect shallow, intermediate, and high-dimensional abstract features to facilitate information exchange between the network layers. Finally, the last pooling layer adopts a weighted fusion of maximum and average pooling to reduce the loss caused by aggressive feature compression. To validate the network, experiments were conducted on the FER2013 and CEDB datasets, and its performance was compared against algorithms such as MobileNetV2 and ShuffleNetV2. The network has slightly more parameters than ResNet18 but significantly fewer than VGG19. The test accuracy on FER2013 and CEDB reaches 73.14% and 90.57%, respectively, with better recognition rates on FER2013 than ResNet18 and VGG19, achieving a balance between universality and recognition accuracy.

(2) To address the challenge of facial expression complexity, we propose a robust network, the Neighbor Attention Residual Network (NA-Resnet), which emphasizes the most recognizable and critical regions of the face, such as the forehead, eyes, and mouth. The network incorporates two optimized attention blocks to enhance its performance. Firstly, the network is divided into a main branch and a neighboring branch, with a residual network as the backbone. Secondly, optimized attention modules are embedded within the residual blocks of the main branch to focus on key regions. Finally, to achieve information exchange, adjacent layers of the network are connected through the neighboring blocks of the neighboring branch. Without loading pretrained parameters, the network achieves an average accuracy of 96.06% with a low standard deviation of 2.9% in ten-fold cross-validation on the seven-expression CK+ dataset, and accuracies of 85.63% and 75.59% on the RAF-DB and FER2013 datasets, respectively, outperforming ResMaskingNet, DACL, FER-VT, and other algorithms.

(3) To address the contradiction between the lack of high-quality labeled datasets and the real-world demand for FER models with high generalization ability and high accuracy, we propose an efficient facial expression recognition network based on MTCNN and transfer learning with EfficientNetV2. Firstly, MTCNN is used to preprocess the AffectNet dataset by accurately detecting and cropping the facial regions. Next, the pretrained parameters learned on the ImageNet dataset are transferred to EfficientNetV2, which is then fine-tuned on the AffectNet dataset. Finally, the parameters learned on AffectNet are further transferred and fine-tuned on the RAF-DB dataset. The recognition accuracy on the AffectNet and RAF-DB datasets reaches 57.9% and 89.51%, respectively.
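The weighted fusion of maximum and average pooling mentioned in contribution (1) can be illustrated with a minimal sketch. The abstract does not state how the two pooled values are weighted, so the convex combination below, the function name `fused_global_pool`, and the parameter `alpha` are assumptions made only for illustration and are not taken from the thesis.

```python
from typing import List

# One channel of a feature map, stored as rows of floats.
FeatureMap = List[List[float]]


def fused_global_pool(channels: List[FeatureMap], alpha: float = 0.5) -> List[float]:
    """Pool each channel to one value: alpha * max + (1 - alpha) * mean.

    alpha is a hypothetical mixing weight (1.0 gives pure max pooling,
    0.0 gives pure average pooling); the thesis may weight differently.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pooled = []
    for fmap in channels:
        # Flatten the spatial grid of this channel.
        values = [v for row in fmap for v in row]
        if not values:
            raise ValueError("empty feature map")
        max_val = max(values)
        mean_val = sum(values) / len(values)
        pooled.append(alpha * max_val + (1.0 - alpha) * mean_val)
    return pooled
```

With `alpha` near 1 the output keeps the strongest activation in each channel, while values near 0 keep the overall response level; a fused value retains some of both, which is one plausible reading of how the fusion limits information loss during compression.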
Keywords/Search Tags: facial expression recognition, convolutional neural network, neighbor block, optimized attention module, transfer learning