With the steady progress of deep learning and artificial intelligence, human-machine interaction has been studied in many fields, and facial expression recognition has emerged as a research focus for enhancing user experience. Deep learning methods have replaced traditional manual feature extraction, and facial expression recognition technology has developed rapidly. The introduction of residual networks improved recognition performance further, and current research focuses on designing better residual networks that extract the necessary features more effectively and thereby raise the accuracy of facial expression recognition. To improve the recognition rate, this thesis improves upon the ResNeXt50 network and the EfficientNetV2 network. The main research work is as follows:

1. To address the loss of crucial information during convolution in current general-architecture deep learning approaches to expression recognition, an improved ResNeXt50 network (named AC-SP-ResNeXt50) is proposed. Building on the ResNeXt50 architecture, a multi-scale feature fusion layer is added that uses four convolution kernels of different sizes to capture more texture information from the original image. In addition, SoftPool is used as the network's pooling layer to retain more feature information, and asymmetric convolution modules are used to construct asymmetric residual structures that enhance feature extraction. The experimental results show that the improved ResNeXt50 network achieves higher facial expression recognition accuracy than the standard ResNeXt50 network on the CK+ and JAFFE datasets.

2. To address the insufficient attention that the improved ResNeXt50 network pays to expression features during feature extraction, an improved EfficientNetV2 network (named EfficientNeXt) is proposed. First, based on the characteristics of the EfficientNetV2 network, the input image size is increased from 224x224 to 300x300 to improve network performance. Second, a spatial attention mechanism is added to reduce interference from unimportant information, increase the network's focus on expression features during feature extraction, and enhance overall performance. Finally, the improved ResNeXt50 residual structure, shown to be effective in the first part of this work, is incorporated into the network, yielding an EfficientNetV2 network that combines multi-scale feature fusion, SoftPool pooling, asymmetric residual structures, and a spatial attention mechanism. The experimental results show that the improved EfficientNetV2 network achieves higher facial expression recognition accuracy than the original EfficientNetV2 network on the CK+, JAFFE, and FER2013 datasets.

3. To bring facial expression recognition closer to everyday use, this thesis designs and develops facial expression recognition software. The system is developed for the Android platform and has three main functions: image acquisition, face detection, and expression recognition. Image acquisition calls the native camera to capture facial images, the FaceDetector class in Android then detects the faces in those images, and finally the trained EfficientNeXt residual network performs expression recognition. Testing shows that the software successfully recognizes 7 facial expressions, meets its accuracy requirements, and exhibits a certain degree of stability and security, realizing the application of expression recognition in practical scenarios.
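
As an illustration of why SoftPool retains more feature information than max pooling, the following is a minimal sketch of the SoftPool operation on a single 2D feature map, written in pure Python. It is not the thesis implementation: the function name `soft_pool_2d` and the `kernel` and `stride` defaults are assumptions chosen for illustration. Each window is reduced to a softmax-weighted sum of its activations, so every value contributes to the output while larger activations dominate.

```python
import math


def soft_pool_2d(feature_map, kernel=2, stride=2):
    """Illustrative SoftPool over a 2D list of floats (hypothetical helper).

    Each output value is sum(softmax(window) * window), i.e. an
    exponentially weighted average of the activations in the window.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - kernel + 1, stride):
        pooled_row = []
        for c in range(0, cols - kernel + 1, stride):
            window = [
                feature_map[r + i][c + j]
                for i in range(kernel)
                for j in range(kernel)
            ]
            # Subtract the window maximum before exponentiating for
            # numerical stability; the softmax weights are unchanged.
            peak = max(window)
            weights = [math.exp(v - peak) for v in window]
            total = sum(weights)
            pooled_row.append(
                sum(w * v for w, v in zip(weights, window)) / total
            )
        pooled.append(pooled_row)
    return pooled
```

For a window containing 1, 2, 3, and 4, the result lies between the average-pooled value (2.5) and the max-pooled value (4), which reflects how SoftPool keeps a contribution from the smaller activations instead of discarding them.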