Video Facial Emotion Recognition Based On Deep Learning

Posted on: 2021-03-04  Degree: Master  Type: Thesis
Country: China  Candidate: Y Gao  Full Text: PDF
GTID: 2428330614960371  Subject: Computer application technology
Abstract/Summary:
Facial expressions are a natural and direct way of conveying emotion and play an important role in daily life. On some occasions, expressions convey true feelings more accurately than words. With the development of computer technology, expression recognition based on video sequences has attracted growing attention. An expression is produced through a dynamic process. A single static image usually carries limited feature information, whereas the expressions in a video sequence supply richer contextual information and better match the way expressions are actually generated, giving the research more information to work with. This thesis focuses on video sequences and proposes algorithms for efficiently extracting spatio-temporal features from them. The specific contributions are as follows:

(1) A weighted two-stream network model is proposed. When traditional methods are used to extract facial expression features, the extracted features are usually confined to a fixed feature space, and robustness needs further improvement. With the development of deep learning and the growth of public datasets, this problem can be addressed well. A single-stream convolutional network usually focuses on spatial features and ignores the contextual information in the video sequence. A two-stream network simulates the human visual process: while processing spatial information, it also better captures the temporal information in the video, so it is adopted as the model structure. In addition, to capture the temporal information hidden between frames of the image sequence, an LSTM structure is added to the model. One stream takes the original image sequence as input, and the other takes the processed gradient edge detection maps. The final result is a weighted fusion of the two streams' outputs. Experiments on a public facial expression dataset demonstrate the effectiveness of the network structure.

(2) In a video sequence, the expression intensity differs from one face image to another, so the frames should be distinguished according to how much each contributes. This thesis proposes a video expression recognition network that incorporates an attention mechanism. It uses an end-to-end CNN-RNN structure, with the attention mechanism applied after the RNN. Specifically, the CNN part is ResNet and the RNN part is a bidirectional LSTM network. The high-level abstract features learned by the convolutional part are passed to the bidirectional LSTM, which captures the temporal dependencies across the image sequence. After the final representation of the sequence is obtained, the attention mechanism increases the weight coefficients of salient features and reduces the influence of secondary features. Experiments on public datasets and comparisons with other algorithms demonstrate the effectiveness of the proposed network.
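The weighted fusion in (1) and the attention weighting in (2) can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the function names, the `image_weight` hyperparameter, the dot-product scoring, and the `query` vector are all assumptions made for illustration, and the abstract does not state how the fusion weights or attention scores are actually computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(image_probs, edge_probs, image_weight=0.5):
    """Weighted late fusion of two per-class probability vectors.

    image_weight is a hypothetical hyperparameter in [0, 1]; the
    edge-map stream receives the remaining weight 1 - image_weight.
    """
    if len(image_probs) != len(edge_probs):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    edge_weight = 1.0 - image_weight
    return [
        image_weight * p + edge_weight * q
        for p, q in zip(image_probs, edge_probs)
    ]


def attention_pool(frame_features, query):
    """Attention-weighted sum of per-frame feature vectors.

    Each frame is scored by its dot product with `query` (an assumed
    stand-in for a learned attention vector), the scores are normalised
    with softmax, and the frames are summed using those weights.
    Returns the pooled vector and the per-frame weights.
    """
    scores = [
        sum(f * q for f, q in zip(frame, query))
        for frame in frame_features
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(len(query))
    ]
    return pooled, weights


if __name__ == "__main__":
    # Two streams scoring three hypothetical expression classes.
    fused = fuse_streams([0.7, 0.2, 0.1], [0.4, 0.5, 0.1], image_weight=0.6)
    print("fused class probabilities:", fused)

    # Three frames with 2-dimensional features, e.g. recurrent outputs.
    frames = [[0.1, 0.0], [2.0, 0.5], [0.3, 0.1]]
    pooled, weights = attention_pool(frames, query=[1.0, 0.0])
    print("frame weights:", weights)
    print("pooled feature:", pooled)
```

In this sketch, frames whose features align more closely with the query receive larger weights, which mirrors the stated goal of emphasising salient features and reducing the influence of secondary ones.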
Keywords/Search Tags:Deep learning, Facial expression recognition, Edge detection, Spatio-temporal feature, Attention mechanism