
Video-driven Dynamic Emotion Analysis

Posted on: 2020-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: D B Meng
Full Text: PDF
GTID: 2428330596964237
Subject: Control engineering
Abstract/Summary:
Recently, with the rapid progress of artificial intelligence, computer vision, and face-related technologies, intelligent robots have come into wide use across many fields. People hope that computers can possess emotions as humans do, and strongly desire that computers understand human intentions. Facial expression, as the most important biological and emotional signal of human beings, has significant application value in human-computer interaction, medical diagnosis, robot manufacturing, and investigation and interrogation. How to accurately understand facial expressions has therefore attracted widespread attention from artificial intelligence researchers.

Traditional expression recognition methods rely mainly on hand-crafted features. With the rapid development of deep learning, deep-learning-based algorithms have become the mainstream approach to expression recognition. Video expression recognition focuses mainly on how to integrate video frames, audio, and other modalities into a video-level expression feature. The limitation of existing aggregation methods is that they ignore the varying importance of individual frames for emotion classification. We therefore propose an attention-based expression recognition framework that assigns lower weights to frames with inconspicuous expressions and higher weights to frames with obvious expressions; these weights are then used to aggregate a discriminative video-level expression feature. The method achieves an accuracy of 99.69% on the CK+ dataset, the best recognition result reported to date, and an accuracy of 51.181% on AFEW, which compares favorably with other CNN-based methods.

In recent years, multimodal expression recognition has made rapid progress. The AVEC international emotion recognition challenge, held since 2011, compares the relative merits of video and audio information for emotion recognition and investigates to what extent fusion of the two approaches is possible and beneficial. The EmotiW international emotion recognition challenge, held since 2013, focuses on the ability to recognize emotions under real-world conditions. This thesis proposes a multimodal expression recognition framework that achieved ninth place in EmotiW 2018, with classification accuracy 17% higher than the baseline method.

Computer classification of facial expressions requires large amounts of data, and this data needs to reflect the diversity of conditions seen in real applications. Currently, video expression data are mainly collected either in lab-controlled environments or from movie clips. Data collected in lab-controlled environments lack the varied illumination, occlusion, and poses of real scenes and offer limited expression diversity, and large-scale collection in such environments is expensive. Expression data from movie clips contain complex background information, yet their expressions are exaggerated compared with real life. Based on these observations, we collect expression data from real-life videos to build a large-scale, reliable, and realistic expression database. Since labels collected in this way can be inconsistent, designing an algorithm that can discover the latent truth from inconsistent expression labels is one direction of our future work. In addition, labeling AUs (action units) and video emotion data is expensive, so the available AU and video data are relatively small. How to use abundant image data to improve the performance of video expression recognition and AU detection is therefore another direction of our future work.
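The attention-based frame aggregation described in the abstract can be sketched as follows. This is a minimal illustration under assumed details: simple softmax attention over per-frame CNN features with a learned projection vector `w` (a hypothetical parameter), not the exact model of the thesis.

```python
import numpy as np

def aggregate_frames(frame_features: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Aggregate per-frame features into a single video-level feature.

    frame_features: (T, D) array of per-frame CNN features.
    w: (D,) attention projection vector (hypothetical learned parameter).
    """
    scores = frame_features @ w                      # (T,) relevance score per frame
    scores = scores - scores.max()                   # subtract max for numerical stability
    alphas = np.exp(scores) / np.exp(scores).sum()   # softmax: attention weight per frame
    return alphas @ frame_features                   # weighted sum -> (D,) video feature

# Toy usage: 4 frames with 3-dimensional features
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 3))
w = rng.normal(size=3)
video_feature = aggregate_frames(feats, w)
```

Frames whose features align with `w` (an "obvious" expression, in the abstract's terms) receive larger softmax weights and dominate the aggregated feature, while inconspicuous frames are down-weighted rather than averaged in uniformly.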
Keywords/Search Tags: Artificial intelligence, Deep learning, Expression recognition, Video emotion database, Multimodal emotion recognition algorithm