
Research On Multimodal Emotion Recognition In Conversations Based On Deep Learning

Posted on: 2022-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Shi
Full Text: PDF
GTID: 2558307154976729
Subject: Engineering

Abstract/Summary:
Emotion recognition in conversations (ERC) is a hot topic in natural language processing. The task classifies the emotion of every utterance in a dialogue by mining the emotional information it contains, and in recent years it has been widely applied in emerging areas such as human-computer interaction. Compared with emotion recognition on the text modality alone, multimodal emotion recognition adds the video and audio modalities as supplementary information. This thesis makes the following two contributions:

(1) We propose an attention-based multimodal fusion method that fully accounts for the influence of cross-modal interactive information on ERC. Specifically, feature representations for three modalities, namely text, audio, and video, are extracted from the videos, and multi-head attention captures the interactive information among them, so that every modality can receive information from the others. In addition, since the text modality carries the most effective information for the ERC task, the method retains the original textual information while learning supplementary information from the other modalities. Experimental results show that the method is effective on the mainstream IEMOCAP and MELD databases.

(2) We also propose an Interactive Multimodal Attention Network (IMAN), which accounts for the influence of long-range context and speaker dependency on emotion classification. The network first applies the multimodal fusion method to obtain a refined utterance representation that contains the cross-modal interactive information. We then construct a conversational modeling network built on three gated recurrent units (GRUs): a context GRU, a speaker GRU, and an emotion GRU. The network uses context information and speaker dependency to update the current utterance features and predicts the emotion of each utterance in the dialogue. Detailed evaluations on the IEMOCAP and MELD databases demonstrate that IMAN outperforms state-of-the-art approaches.
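The fusion step described in contribution (1) can be sketched in PyTorch. This is a minimal illustration, not the thesis's actual implementation: the class name `CrossModalFusion`, the feature dimension, and the residual-plus-LayerNorm design are assumptions; the only elements taken from the abstract are multi-head attention over modalities and the retention of the original textual information (here via a residual connection on the text features).

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Hypothetical sketch: text features act as queries and attend over the
    audio and video features; a residual connection preserves the original
    textual information, matching the abstract's description."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text, audio, video):
        # Keys/values come from the supplementary modalities.
        kv = torch.cat([audio, video], dim=1)
        supplementary, _ = self.attn(text, kv, kv)
        # Residual keeps the original text representation intact.
        return self.norm(text + supplementary)

fusion = CrossModalFusion()
t = torch.randn(2, 10, 128)  # (batch, utterances, feature dim) for text
a = torch.randn(2, 10, 128)  # audio features
v = torch.randn(2, 10, 128)  # video features
out = fusion(t, a, v)
print(out.shape)  # torch.Size([2, 10, 128])
```

The fused output has the same shape as the text input, so it can feed directly into a downstream conversational model.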
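Contribution (2)'s three-GRU conversational model can likewise be sketched. The abstract names a context GRU, a speaker GRU, and an emotion GRU but does not specify how they are wired, so the update order below (global context, then per-speaker state, then emotion state feeding a classifier) is an assumed, DialogueRNN-style arrangement; all names, dimensions, and the six-class output are illustrative only.

```python
import torch
import torch.nn as nn

class ConversationModel(nn.Module):
    """Hypothetical sketch of the three-GRU conversational model: a context
    GRU tracks the global dialogue state, a speaker GRU tracks each
    speaker's own state, and an emotion GRU produces the representation
    used for per-utterance emotion classification."""
    def __init__(self, dim=128, n_classes=6, n_speakers=2):
        super().__init__()
        self.n_speakers = n_speakers
        self.context_gru = nn.GRUCell(dim, dim)
        self.speaker_gru = nn.GRUCell(dim, dim)
        self.emotion_gru = nn.GRUCell(dim, dim)
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, utterances, speakers):
        # utterances: (seq_len, dim) fused features for one dialogue;
        # speakers: list of speaker indices, one per utterance.
        dim = utterances.size(1)
        ctx = torch.zeros(1, dim)
        spk = [torch.zeros(1, dim) for _ in range(self.n_speakers)]
        emo = torch.zeros(1, dim)
        logits = []
        for u, s in zip(utterances, speakers):
            u = u.unsqueeze(0)
            ctx = self.context_gru(u, ctx)            # global context state
            spk[s] = self.speaker_gru(u + ctx, spk[s])  # speaker-dependent state
            emo = self.emotion_gru(spk[s], emo)       # emotion state
            logits.append(self.classifier(emo))
        return torch.cat(logits, dim=0)  # (seq_len, n_classes)

model = ConversationModel()
utts = torch.randn(5, 128)                 # 5 fused utterance vectors
logits = model(utts, speakers=[0, 1, 0, 1, 0])
print(logits.shape)  # torch.Size([5, 6])
```

Processing utterances sequentially lets each prediction condition on both the accumulated dialogue context and the current speaker's own history, which is how the abstract motivates the speaker-dependency modeling.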
Keywords/Search Tags:Emotion recognition in conversations, Gated recurrent units, Multimodal, Attention mechanism