
Research On Multimodal Emotion Recognition Based On Deep Learning

Posted on: 2020-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: H M Zhang
Full Text: PDF
GTID: 2428330623951442
Subject: Software engineering
Abstract/Summary:
Emotion recognition is a complex classification task, and accurately expressing emotion requires multiple modalities. Most emotional data on social media is unstructured, characterized by heterogeneous structures, high dimensionality, and a large amount of redundant information. During feature extraction from a single modality, redundant information and noise are easily introduced within that modality. It is therefore essential to design an emotion recognition model that can extract features from different modalities, capture information dependencies, and eliminate the redundant information and noise within single-modality features. Meanwhile, information dependencies also exist between modalities, which traditional feature fusion ignores. How to fuse multiple modalities for emotion recognition while preserving the feature distribution of the original data is another problem to be solved. In view of these problems, this thesis focuses on multimodal emotion recognition based on the text, audio, and video modalities and on deep learning.

To remove redundant information and noise from single-modality features, this thesis proposes a multimodal emotion recognition model based on the chi-square test. First, the features of each modality are extracted by an independent feature extraction scheme designed for that modality. Then, an LSTM structure is added to capture the information dependencies within each modality. Next, chi-square test feature selection is applied to remove redundant information and noise. A series of comparative experiments shows that the proposed model performs best when combined with the chi-square test; compared with the baseline, its accuracy increases by nearly 7%.

On the other hand, each modality has a different feature distribution, and information dependencies also exist between modalities: when the features of one modality are sparse, the other modalities can support the emotional decision. To this end, this thesis designs a multimodal emotion recognition model based on decision-level feature fusion. First, the features of each modality are extracted by an independent feature extraction scheme designed for that modality. Then, the LSTM structure and the chi-square test are used to capture information dependencies and to select features, respectively. Subsequently, feature fusion is performed at the feature level and at the decision level: the LSTM structure captures the information dependencies between modalities, and an SVM serves as the classifier. A series of comparative experiments shows that the proposed model outperforms the single-modality emotion recognition models, and that the decision-level fusion model outperforms the feature-level fusion model. Compared with the baseline, the accuracy of the proposed model increases by nearly 5.2%.
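A minimal sketch of the chi-square feature selection step described above, using scikit-learn's SelectKBest with the chi2 score. The feature matrices, label set, and number of retained features are illustrative assumptions, not the thesis's actual configuration.

```python
# Hedged sketch: chi-square feature selection on single-modality features.
# Shapes, k, and the six-class label set are assumptions for illustration.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

def select_modality_features(features, labels, k=64):
    """Keep the k features most dependent on the emotion labels.

    chi2 requires non-negative inputs, so the features are shifted
    to be >= 0 before scoring.
    """
    shifted = features - features.min(axis=0)   # make all values non-negative
    selector = SelectKBest(score_func=chi2, k=k)
    return selector.fit_transform(shifted, labels)

# Hypothetical per-modality features (e.g., LSTM outputs for the text modality)
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(500, 300))
labels = rng.integers(0, 6, size=500)           # assumed six emotion classes
selected = select_modality_features(text_feats, labels, k=64)
print(selected.shape)                           # (500, 64)
```

The same selection routine can be applied independently to the text, audio, and video feature matrices before fusion.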
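A minimal sketch of decision-level fusion with an SVM classifier, as named in the abstract. For simplicity this sketch concatenates the per-modality class-probability outputs instead of using the thesis's LSTM-based inter-modality modeling; all arrays, modality names, and class counts are assumptions.

```python
# Hedged sketch: decision-level fusion of per-modality SVM decisions.
import numpy as np
from sklearn.svm import SVC

def decision_level_fusion(train_feats, test_feats, y_train, y_test):
    """train_feats / test_feats: dict mapping modality name -> feature matrix."""
    train_decisions, test_decisions = [], []
    for name in train_feats:
        # One classifier per modality; its probability output is its "decision".
        clf = SVC(probability=True).fit(train_feats[name], y_train)
        train_decisions.append(clf.predict_proba(train_feats[name]))
        test_decisions.append(clf.predict_proba(test_feats[name]))
    # Fuse the per-modality decisions and train the final SVM on them.
    fusion_clf = SVC().fit(np.hstack(train_decisions), y_train)
    return fusion_clf.score(np.hstack(test_decisions), y_test)
```

Feature-level fusion differs only in where the concatenation happens: the raw (or selected) features are joined before a single classifier is trained, rather than joining the classifiers' decisions.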
Keywords/Search Tags:multimodal, emotion recognition, deep learning, feature selection, feature fusion