
Research On Expression Recognition Algorithm Based On Deep Learning

Posted on: 2020-05-21 | Degree: Master | Type: Thesis
Country: China | Candidate: T Xia | Full Text: PDF
GTID: 2428330620956156 | Subject: Information and Communication Engineering
Abstract/Summary:
Communication is an eternal topic for humanity. Among the many ways humans communicate, expression plays a vital role because of the rich information it carries. A machine that can read human expressions can serve as auxiliary equipment in fields such as medical care, nursing care, education, and security, so expression recognition has become one of the typical problems in computer vision. Depending on the form of the input, expression recognition methods fall into two categories: those based on a single static image and those based on an image sequence.

For single static expression images, this paper proposes a method based on transfer learning. Because traditional LBP codes cannot describe code-to-code distance and are therefore unsuitable as input to a convolutional network, we map them into a new three-dimensional space that better describes code-to-code distance and use the result as network input. In addition, the Inception network pre-trained on a color image database has three-channel convolution kernels, which causes a dimension mismatch with grayscale images; we therefore pad the RGB channels with the original grayscale values, and this input complements the mapped LBP input. Decision-level fusion lets the mapped LBP input and the padded image input complement each other and achieves better performance than the baseline method on the FER-2013 database.

For expression image sequences, this paper proposes a method based on landmark information and the joint training of two networks. First, we propose a method for extracting a fixed number of frames from a variable-length input image sequence such that the typical change patterns of the expression are retained. Second, we propose a global deep network and a local deep network, each trained independently on the landmark information as a separate discriminative model. Third, we design a fusion method and a joint training loss function that amplify inter-class differences between different expressions and constrain intra-class differences within the same expression. Experiments show that the proposed network outperforms most known methods on three databases (CK+, Oulu-CASIA, and MMI) and trails the best model only slightly on MMI. Its computational complexity, however, is substantially lower, which gives a better balance between recognition rate and computing resources.

Also for expression image sequences, this paper proposes a method based on a variable-length 3D convolutional network and a Siamese attention mechanism. Current research mostly focuses on the common subject-independent task, while cross-database evaluation is rare and lacks a universal protocol. The key challenge in both tasks is to extract features that effectively describe the pattern of an expression. The depth of the convolution kernels in the 3D network is much smaller than the number of input channels, which introduces local receptive fields in both the temporal and spatial domains and yields variable-length high-level features whose dimension changes with the number of input channels. The Siamese network uses the "neutral, intermediate, peak" frames of another subject with the same expression label to provide attention weights for the extracted features. By computing the similarity between the high-level features of the two networks, the attention weights let the network focus on subject-independent features and make feature extraction more effective. Experiments show that the model captures the critical change frames in the input image sequence and achieves strong performance on both the subject-independent task and the cross-database task. Additionally, we recommend an experimental design that can serve as a baseline for fair comparison in future research.
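
The idea of deriving attention weights from the similarity between the features of two network branches can be illustrated with a minimal sketch. The abstract does not state which similarity measure or normalization the thesis uses, so cosine similarity and a softmax are assumptions made here for illustration only, and the function names (`cosine_similarity`, `attention_weights`, `attend`) are hypothetical rather than taken from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(features, reference_features):
    """One weight per time step of `features`.

    Each step is scored by its best cosine similarity to any reference
    feature (e.g. features of the neutral, intermediate, and peak frames
    of another subject), and the scores are normalized with a softmax.
    This scoring rule is an assumption, not the thesis's definition.
    """
    scores = [
        max(cosine_similarity(f, r) for r in reference_features)
        for f in features
    ]
    return softmax(scores)


def attend(features, weights):
    """Weighted sum of per-step features, giving one pooled vector."""
    dim = len(features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]
```

Because the pooling is a weighted sum over time steps, a sketch like this accepts any number of steps, which is compatible with variable-length features; steps that resemble the reference frames contribute more to the pooled vector.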
Keywords/Search Tags: Deep learning, Expression recognition, Transfer learning, Joint training, Siamese network, Attention mechanism