Research On Facial Expression Recognition Algorithm Based On Deep Learning

Posted on: 2021-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2518306119969889
Subject: Control Engineering
Abstract/Summary:
With the sustained development of pattern recognition and artificial intelligence, face recognition technology has become increasingly mature. As a branch of face recognition, facial expression recognition has gradually become a research hotspot in computer vision, pattern recognition, and human emotion understanding. Traditional facial expression recognition algorithms rely on hand-crafted features, so their design cycle is long and their recognition rate is limited. Neural network algorithms instead rely on the network architecture and on models trained on dataset samples. After training on a large number of facial expression samples, they achieve higher recognition accuracy than traditional algorithms, but they consume more computing resources and take longer to train. To address these problems, this thesis proposes several improved network models for classification that raise recognition accuracy and accelerate the convergence of the network models. The main research work is as follows:

(1) An improved SR-VGG19 network model is proposed to recognize facial expression images. SR-VGG19 is based on the VGG19 network model, with an optimized residual module added. On the feature map of the last convolutional layer, an Improved Regional Proposal Network (IRPN) replaces the sliding window to avoid repeated extraction of image features. To improve the expressive power of the image features and speed up convergence, Spatial Pyramid Pooling (SPP) is introduced and batch normalization (BN) is added between the convolutional layer and the fully connected layer. During training, L2 normalization is introduced to constrain the cross-entropy loss function and avoid overfitting. Because the CK+ dataset is small enough to cause overfitting, it is expanded to increase the number of samples. The algorithm is evaluated on the classic FER2013 and CK+ facial expression databases, where it is compared with the top ten algorithms of the 2013 Kaggle competition and with facial expression recognition algorithms proposed in recent years, respectively; the comparisons verify the superiority of the proposed algorithm.

(2) An improved SENet (Squeeze-and-Excitation Networks) is proposed. This network not only increases the recognition accuracy of the model but also accelerates its convergence during training. SENet uses the Squeeze and Excitation strategies to model the correlation between feature channels, and then distributes feature weights through the Reweight operation. First, the Squeeze strategy compresses the spatial dimensions, turning each two-dimensional feature channel into a single real number, so that the output has one value per input feature channel. Second, the Excitation strategy produces a weight for each feature channel through learned parameters w. Finally, the Reweight operation applies the Excitation output weights channel by channel to the preceding features, completing the re-weighting of the features along the channel dimension.

(3) Based on the ResNet18 and VGG19 network models, the improved SENet is introduced, and the SE-ResNet18 and SE-VGG19 network models are proposed. Model-based structure transfer, an inductive transfer learning method, and the PReLU activation function are introduced to accelerate convergence during training. To counter class imbalance and mine the information in hard samples, a Focal Loss function is added to the network. To avoid overfitting, Dropout and Batch Normalization (BN) strategies are applied to the network. Experimental results show that the accuracy and speed of the SE-ResNet18 and SE-VGG19 network models are better than those of the Network C?Modern deep CNNs and Deep model network models.

(4) Spatial Pyramid Pooling (SPP) is used to remove the input image size restriction of the VGG network model. Because the fully connected layer of the VGG19 network requires a fixed input feature dimension, SPP is introduced into the VGG19 network. SPP pools the feature maps extracted by the last convolutional layer with windows of several different sizes and concatenates the results into a fixed-length output. As a multi-scale pooling method, SPP captures the feature information of the image at different scales, which improves scale invariance, and the multi-window pooling operation can improve recognition accuracy. To some extent, SPP enhances the expressive power of the extracted image features and further improves the facial expression recognition accuracy of the VGG19 network model.
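The Squeeze, Excitation, and Reweight steps described in item (2) can be sketched in plain Python as follows. This is a minimal illustration, not the thesis's improved SENet: it assumes the commonly described design in which Squeeze is global average pooling and Excitation is two fully connected layers with ReLU and sigmoid activations. The function names and the parameters `w1`, `b1`, `w2`, `b2` are hypothetical.

```python
import math

def squeeze(feature_maps):
    """Squeeze: average each 2-D channel down to one real number."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

def dense(vector, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def excitation(z, w1, b1, w2, b2):
    """Excitation: map channel descriptors to per-channel weights in (0, 1).
    Assumed form: dense -> ReLU -> dense -> sigmoid."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]

def reweight(feature_maps, scales):
    """Reweight: multiply every value in a channel by that channel's weight."""
    return [[[v * s for v in row] for row in fmap]
            for fmap, s in zip(feature_maps, scales)]

def se_block(feature_maps, w1, b1, w2, b2):
    """Apply Squeeze, Excitation, and Reweight to a list of channels."""
    z = squeeze(feature_maps)
    s = excitation(z, w1, b1, w2, b2)
    return reweight(feature_maps, s)
```

The output has the same shape as the input, so under these assumptions a block of this kind can be placed after a convolutional stage of a backbone such as ResNet18 or VGG19 without changing the layers that follow.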
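The Focal Loss mentioned in item (3) is usually written as FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below shows how this down-weights easy samples relative to plain cross-entropy. The values `gamma=2.0` and `alpha=1.0` are illustrative assumptions; the abstract does not state the settings used in the thesis.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample; `target` is the true class index.
    gamma and alpha are assumed values, not taken from the thesis."""
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

def cross_entropy(logits, target):
    """Cross-entropy is the special case gamma = 0, alpha = 1."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)
```

For a confidently correct prediction the factor (1 - p_t)^gamma is close to zero, so the sample contributes little to the loss, while a hard sample with low p_t keeps most of its cross-entropy loss.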
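The fixed-length property of SPP described in item (4) can be illustrated with a short sketch. It assumes max pooling and pyramid levels of 1x1, 2x2, and 4x4 bins; the abstract does not specify the pooling type or the window sizes, so both are hypothetical choices, as is the function name `spp`.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive integers."""
    return -(-a // b)

def spp(feature_maps, levels=(1, 2, 4)):
    """Max-pool each channel into n x n bins for every n in `levels`
    and concatenate the results into one flat vector.
    Output length is channels * sum(n * n for n in levels),
    regardless of the height and width of the feature maps."""
    pooled = []
    for n in levels:
        for fmap in feature_maps:
            h, w = len(fmap), len(fmap[0])
            for i in range(n):
                # Row range of bin i; bins may overlap when h is not divisible by n.
                r0, r1 = (i * h) // n, ceil_div((i + 1) * h, n)
                for j in range(n):
                    c0, c1 = (j * w) // n, ceil_div((j + 1) * w, n)
                    pooled.append(max(fmap[r][c]
                                      for r in range(r0, r1)
                                      for c in range(c0, c1)))
    return pooled
```

With these assumed levels, a 4x4 feature map and a 6x5 feature map both yield 21 values per channel, which is why a fully connected layer placed after SPP can accept images of different sizes.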
Keywords/Search Tags:Residual Network, VGG Network, Spatial Pyramid Pooling, Transfer Learning, Improved Regional Proposal Network