Facial expressions are the primary way in which human beings express emotions. Facial expression recognition (FER) is an important research direction in affective computing. Progress in FER can improve the user experience and intelligence of human-computer interaction systems, and it has significant application value in fields such as traffic safety, personalized recommendation, advertising, and service robots. Most early research on facial expression recognition relied on manual feature extraction, whose steps were complicated. With advances in deep learning techniques and growth in dataset size, traditional FER methods have gradually given way to deep learning methods. However, these deep learning models often have a large number of parameters, high computational complexity, and low computational efficiency, which makes them difficult to deploy in practical applications. As a result, more and more work in recent years has focused on lightweight FER models. Most existing lightweight FER methods consider only computational cost and perform poorly in terms of recognition accuracy. Moreover, most works evaluate models only with indirect metrics such as the number of parameters and the amount of computation, which can lead to large gaps between theoretical and practical efficiency. For FER models to run efficiently on mobile or edge devices, it is important to study lightweight FER methods that strike a good balance between performance and efficiency. This thesis investigates this problem, and its content and results are summarized in the following three points.

(1) A lightweight facial expression recognition network is constructed based on the GhostNet model, and a lightweight Coordinate Attention (CA) module is introduced into the
network. Building on GhostNet, the channel attention module is removed and the CA module is introduced to capture both channel and spatial information in the feature maps. Ablation experiments show that adding the CA module to GhostNet improves recognition performance with little increase in computational cost: the total number of parameters increases by only 0.01 million, MAdds (multiply-accumulate operations) increase by 0.39 million, and latency increases by 1.78 milliseconds. The proposed network achieves a recognition accuracy of 82.72% on the RAF-DB dataset, an improvement of 0.42% over the network without the CA module, and 62.14% on the AffectNet-7 dataset, an improvement of 0.33%. Introducing CA attention thus improves recognition performance while the network retains high theoretical and practical efficiency.

(2) Recognition performance is further improved by training the network with a dynamic deep mutual learning method. The thesis improves the deep mutual learning strategy by using combination coefficients, exponential decay functions, and piecewise functions to vary the ratio in which the two loss terms of deep mutual learning are combined, gradually increasing the share of label distribution information to further improve the performance of the small network. The best results were obtained with the piecewise function, which raises the proportion of the label distribution loss in the total loss in stages during training. Relative to the best results of the original deep mutual learning method, recognition accuracy improved by 0.97% on RAF-DB and 1.11% on AffectNet-7. Recognition accuracies of 86.92%, 63.75%, and 59.92% were achieved on the RAF-DB, AffectNet-7, and AffectNet-8 datasets, respectively. Since the training strategy does not change the
network structure and adds no computational cost at inference, the number of parameters, the amount of computation, and the measured inference latency of the proposed network are lower than those of other lightweight FER methods, demonstrating that the method achieves a good balance between performance and efficiency.

(3) An Android application for facial expression recognition is developed. To run the deep learning model on mobile devices, the MNN (Mobile Neural Network) inference engine is compiled for Android, and the best PyTorch model trained on the RAF-DB dataset is selected and converted to an MNN model. The converted model is then deployed in an Android application built with Android Studio, which performs expression recognition on both static images and camera input. This further verifies the effectiveness and efficiency of the model proposed in the thesis and provides a reference for subsequent applications of facial expression recognition algorithms.
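The dynamic loss weighting described in point (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the start and end weights, the decay rate, the stage values, and the form of the combined loss (a classification term plus a label distribution term, mixed by a single weight) are all assumptions chosen for illustration. The sketch only shows how a constant coefficient, an exponential schedule, and a piecewise (staged) schedule could each set the share of the label distribution loss over the course of training.

```python
import math


def constant_weight(epoch, total_epochs, alpha=0.5):
    """Fixed combination coefficient (hypothetical default value)."""
    return alpha


def exponential_weight(epoch, total_epochs, start=0.1, end=0.9, rate=5.0):
    """Weight rises from `start` toward `end`; the remaining gap decays
    exponentially with training progress. All constants are assumptions."""
    progress = epoch / max(total_epochs - 1, 1)
    return end - (end - start) * math.exp(-rate * progress)


def piecewise_weight(epoch, total_epochs, stages=(0.2, 0.5, 0.8)):
    """Staged schedule: training is split into equal phases and each phase
    uses a larger weight than the previous one (values are assumptions)."""
    index = min(epoch * len(stages) // total_epochs, len(stages) - 1)
    return stages[index]


def combined_loss(classification_loss, distribution_loss, weight):
    """Mix the two loss terms; `weight` is the share of the label
    distribution loss in the total loss (assumed convex combination)."""
    return (1.0 - weight) * classification_loss + weight * distribution_loss


if __name__ == "__main__":
    total = 30
    for epoch in (0, 10, 20, 29):
        print(
            epoch,
            constant_weight(epoch, total),
            round(exponential_weight(epoch, total), 4),
            piecewise_weight(epoch, total),
        )
```

In a training loop, one of these schedules would be evaluated once per epoch and its value passed to `combined_loss`, so the label distribution term counts for more as training proceeds while the network used at inference stays unchanged.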