
Multimodal Sentiment Recognition Based on Expression, Text and Speech

Posted on: 2022-08-06  Degree: Master  Type: Thesis
Country: China  Candidate: Y R Ma  Full Text: PDF
GTID: 2518306557971309  Subject: Electronics and Communications Engineering
Abstract/Summary:
Sentiment recognition has become a hot topic in pattern recognition. Single-modal sentiment recognition has a long research history, but its limitations have gradually shifted attention toward multi-modal sentiment recognition. Multi-modal data includes, but is not limited to, facial-expression, text, and speech data. Among the many ways of expressing sentiment, expression, text, and speech are the most direct and reliable carriers of emotional information, so emotion recognition that jointly considers these three modalities has significant research and practical value. This thesis analyzes the state of multi-modal sentiment recognition, selects a multimodal sentiment dataset containing expression, text, and speech, performs sentiment recognition on each of the three modalities, and fuses the results at the decision level. The main contributions are as follows:

(1) To avoid the cumbersome traditional hand-crafted feature extraction process, the expression modality is handled with the VGG16 pre-trained model. To extract richer expression features, the pre-trained model is combined with face key-point detection to obtain expression sentiment features, and a softmax classifier performs the final sentiment classification. Classification accuracy reaches 71.01%, 17.6 percentage points higher than traditional feature methods, verifying the effectiveness of the approach.

(2) Word2Vec is limited by its local context window and cannot effectively exploit global word co-occurrence statistics, so word-vector features for the text modality are instead extracted with GloVe, which better captures contextual relationships. To address word ambiguity, the GloVe pre-trained model is combined with Google's open-source BERT pre-trained model, making the extracted emotional features richer. Classification accuracy reaches 73.81%, 8.87 percentage points higher than traditional feature methods, verifying the effectiveness of the approach.

(3) To address the insufficient representational power of traditional hand-crafted features, speech emotion recognition is built on the VGGish pre-trained model. The speech is first preprocessed, the pre-trained VGGish network then extracts speech features, and a softmax classifier performs the final emotion classification. Classification accuracy reaches 67.92%, 7.03 percentage points higher than traditional feature methods, verifying the effectiveness of the approach.

(4) To address the incomplete information and strong interference of single-modal emotion recognition, different multi-modal fusion methods are studied and verified experimentally. The experiments show that the result of decision-level fusion varies with the weight assigned to each modality. To assign more accurate weights, a decision-level fusion method based on a weighting matrix is studied, and the single-modal recognizers above are fused to obtain the final multi-modal emotion recognition result. The experiments show that the multi-modal system built with this decision-fusion algorithm achieves a higher recognition rate than any of the single-modal systems.
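The expression pipeline above (pre-trained CNN features combined with face key points, then a softmax classifier) can be sketched minimally in numpy. The dimensions are assumptions for illustration only: a 512-d VGG16-style feature vector, 68 facial landmarks (136 coordinates), and 7 emotion classes; the thesis does not specify these, and the `classify_expression` helper is hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify_expression(cnn_feat, keypoint_feat, W, b):
    """Concatenate CNN features with face key-point coordinates,
    then apply a linear layer followed by softmax."""
    x = np.concatenate([cnn_feat, keypoint_feat])
    return softmax(W @ x + b)

rng = np.random.default_rng(0)
cnn_feat = rng.standard_normal(512)       # stand-in for VGG16 features
kp_feat = rng.standard_normal(136)        # 68 landmarks x (x, y)
W = rng.standard_normal((7, 648)) * 0.01  # 7 assumed emotion classes
b = np.zeros(7)
probs = classify_expression(cnn_feat, kp_feat, W, b)
```

In practice `W` and `b` would be trained; here random weights only demonstrate the feature-concatenation structure.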
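For the text modality, one common way to combine static GloVe vectors with contextual BERT vectors is token-wise concatenation followed by pooling. The abstract does not say how the two models are combined, so this is only an assumed scheme; the 300-d GloVe and 768-d BERT dimensions are conventional defaults, not values from the thesis.

```python
import numpy as np

def combine_embeddings(glove_vecs, bert_vecs):
    """Concatenate per-token GloVe and BERT vectors, then
    mean-pool over tokens to get one sentence vector."""
    joint = np.concatenate([glove_vecs, bert_vecs], axis=-1)
    return joint.mean(axis=0)

rng = np.random.default_rng(1)
glove = rng.standard_normal((12, 300))  # 12 tokens, 300-d GloVe
bert = rng.standard_normal((12, 768))   # 12 tokens, 768-d BERT
sent_vec = combine_embeddings(glove, bert)
```

The pooled vector would then feed a sentiment classifier, as in the expression and speech branches.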
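The decision-level fusion with a weighting matrix can be sketched as a per-modality, per-class weighting of the three classifiers' probability outputs. The exact form of the thesis's weighting matrix is not given, so this assumes an element-wise weight per (modality, class) pair; all numbers are illustrative.

```python
import numpy as np

def weighted_decision_fusion(probs, weights):
    """probs:   (M, C) class probabilities from M modalities.
    weights:    (M, C) non-negative weighting matrix.
    Returns fused class scores, renormalized to probabilities."""
    fused = (probs * weights).sum(axis=0)
    return fused / fused.sum()

probs = np.array([[0.60, 0.30, 0.10],   # expression branch
                  [0.20, 0.50, 0.30],   # text branch
                  [0.30, 0.40, 0.30]])  # speech branch
weights = np.array([[0.5, 0.3, 0.4],
                    [0.3, 0.4, 0.3],
                    [0.2, 0.3, 0.3]])
fused = weighted_decision_fusion(probs, weights)
```

Giving a modality a larger weight on the classes it recognizes well is what lets the fused system outperform each single-modal branch.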
Keywords/Search Tags:expression sentiment recognition, text sentiment recognition, speech sentiment recognition, multimodal sentiment recognition, decision fusion