
Research On Multi-modal Emotion Recognition Methods Based On Multi-task Learning And Attention Mechanism

Posted on: 2022-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Wang
Full Text: PDF
GTID: 2518306614959929
Subject: Computer Software and Computer Application
Abstract/Summary:
With the rapid development of the information age, people interact with the Internet ever more closely, and emotion recognition technologies that bring together artificial intelligence, natural language processing, and cognitive science have developed rapidly. Human emotions are usually expressed through natural language, voice, facial expressions, and body language. As the video social-media industry rises and more and more users voice their opinions through short videos, the amount of online multi-modal content grows exponentially, and emotion recognition technology has gradually evolved from the initial single-modal recognition to multi-modal recognition. By comprehensively analyzing the emotions a person expresses across multiple modalities, we can grasp their overall emotional state more accurately, which in turn benefits applications such as public opinion analysis and product demand mining.

The main task of multi-modal emotion recognition is to combine the emotional information of multiple modalities to identify a person's overall emotional state. This thesis makes three contributions. First, to address the insufficiently comprehensive feature extraction of the text modality, word vectors are extracted with the BERT model, which captures the polysemy of text elements and enriches the semantic information. Second, to address the weak fusion ability of traditional fusion methods, which ignore the emotional interaction between modalities, a cross-modal attention mechanism is proposed to mine each modality's emotional contribution to the overall model and to achieve full interactive fusion of emotional information across modalities. Finally, to address the fact that traditional methods emphasize the emotional similarity between modalities while ignoring their differences, a multi-task learning framework is constructed that introduces single-modal emotion recognition as auxiliary tasks, strengthening the model's attention to emotional differences. In addition, a label generation module based on self-supervised learning and a momentum-based update strategy are proposed to generate stable emotion labels for the multi-modal tasks.

Experiments are carried out on the public multi-modal datasets CMU-MOSI and CMU-MOSEI. The results show that the proposed multi-modal emotion recognition model based on multi-task learning and the attention mechanism can accurately judge the emotional polarity of the people in the datasets, achieving accuracies of 85.36% and 84.61% on the two datasets respectively, 2% to 10% higher than existing models.
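To make the described architecture concrete, the following PyTorch sketch illustrates one possible reading of the abstract's two core ideas: cross-modal attention fusion and a multi-task head that adds unimodal emotion recognition as auxiliary tasks, plus a momentum-style label update. This is not the thesis's actual code; all module names, dimensions, pooling choices, and hyperparameters are assumptions made for this example.

```python
# Illustrative sketch only; not the author's implementation.
import torch
import torch.nn as nn


class CrossModalAttention(nn.Module):
    """Lets a query modality (e.g. text) attend over a context modality (e.g. audio or vision)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # query: (batch, len_q, dim), context: (batch, len_c, dim)
        fused, _ = self.attn(query, context, context)
        return self.norm(query + fused)  # residual connection


class MultiTaskFusionModel(nn.Module):
    """Fuses text/audio/vision features and predicts a multimodal emotion score
    plus auxiliary unimodal scores (multi-task learning)."""

    def __init__(self, dim: int = 128):
        super().__init__()
        self.text_to_audio = CrossModalAttention(dim)
        self.text_to_vision = CrossModalAttention(dim)
        self.multimodal_head = nn.Linear(3 * dim, 1)
        # Auxiliary unimodal heads keep the model sensitive to inter-modality differences.
        self.unimodal_heads = nn.ModuleDict(
            {m: nn.Linear(dim, 1) for m in ("text", "audio", "vision")}
        )

    def forward(self, text, audio, vision):
        # Each input: (batch, seq_len, dim); sequence-level features via mean pooling.
        ta = self.text_to_audio(text, audio).mean(dim=1)
        tv = self.text_to_vision(text, vision).mean(dim=1)
        t, a, v = text.mean(dim=1), audio.mean(dim=1), vision.mean(dim=1)
        multimodal = self.multimodal_head(torch.cat([t, ta, tv], dim=-1))
        unimodal = {
            "text": self.unimodal_heads["text"](t),
            "audio": self.unimodal_heads["audio"](a),
            "vision": self.unimodal_heads["vision"](v),
        }
        return multimodal, unimodal


def momentum_update(old_label: torch.Tensor, new_estimate: torch.Tensor, beta: float = 0.9):
    """Momentum-style smoothing of self-supervisedly generated labels (assumed value of beta)."""
    return beta * old_label + (1.0 - beta) * new_estimate
```

In a training loop, the multimodal prediction and the auxiliary unimodal predictions would each contribute a loss term, with the unimodal targets refreshed via the momentum update; the weighting between these terms is a design choice not specified in the abstract.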
Keywords/Search Tags:multimodal data, emotion recognition, multi-task learning, attention mechanism