
Emotion Recognition Based On Un-aligned Multimodal Framework

Posted on: 2022-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: K R Wang
Full Text: PDF
GTID: 2518306767977519
Subject: Computer Software and Application of Computer
Abstract/Summary:
At present, single-modal emotion recognition technologies such as facial expression recognition and speech emotion recognition are widely used in many fields. However, a single modality can suffer from insufficient feature information or heavy interference from the environment. Multi-modal methods are therefore receiving growing attention, owing to the diversity of modalities and the complementarity of features across them. Although multi-modal fusion frameworks compensate for the deficiencies of a single modality to some extent, different modalities exhibit both heterogeneity and similarity, so it is important to choose a fusion structure suited to the modalities being fused without increasing feature redundancy. Moreover, to establish relationships between features of different modalities, multi-modal architectures often rely on alignment information between modalities and ignore the asynchrony between them.

To address this asynchrony of multi-modal features, this thesis combines the attention mechanism with the Transformer and uses self-attention to form a cross-modal attention module, so that the feature modalities can be related closely to one another without requiring alignment. In addition, when processing text features, paragraph-vector embeddings are applied to the text and used in the corresponding depression detection experiments. Second, to overcome the shortage of experimental data for some modalities, a GAN mechanism is applied for feature enhancement; combining the GAN with the attention mechanism, the degree of depression is estimated from multi-modal features.

Experiments are conducted on the IEMOCAP, AVEC2017, and AVEC2019 datasets. BoVW, openSMILE, ASR, and paragraph-vector embedding are used to process the video, audio, and text features respectively, and emotion recognition and depression prediction are performed with the un-aligned cross-modal attention module. Both experiments achieve good results, which shows that the un-aligned cross-modal attention module proposed in this thesis performs well.
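The core idea of the un-aligned cross-modal attention module can be sketched as follows: queries come from a target modality and keys/values from a source modality, so the two sequences may have different lengths and no frame-level alignment is needed. This is a minimal plain-NumPy illustration with hypothetical shapes; the thesis's actual module additionally uses learned projection matrices and a full Transformer stack.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(target, source):
    """Scaled dot-product attention across two modalities.

    target: (len_t, d) features of the modality being enriched
    source: (len_s, d) features of the other modality
    len_t and len_s may differ -- no alignment is assumed.
    """
    d_k = target.shape[-1]
    scores = target @ source.T / np.sqrt(d_k)   # (len_t, len_s)
    weights = softmax(scores, axis=-1)          # each target step attends over all source steps
    return weights @ source                     # (len_t, d), source info fused into target

# Example: 6 text steps attending over 10 audio frames (unequal lengths)
rng = np.random.default_rng(0)
text = rng.standard_normal((6, 32))
audio = rng.standard_normal((10, 32))
fused = cross_modal_attention(text, audio)      # shape (6, 32)
```

Because the attention weights span the entire source sequence, each target step can draw on whichever source frames are relevant, which is what allows the module to handle asynchronous, un-aligned modalities.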
Keywords/Search Tags:Un-aligned Features, Cross-modal Attention, GAN, Multi-modal Emotion Recognition, Depression Detection