
Design of an Emotion Recognition System Based on Multimodal Feature Fusion

Posted on: 2022-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: W Chen
Full Text: PDF
GTID: 2518306752493474
Subject: Computer Software and Computer Applications
Abstract/Summary:
With the rapid development of artificial intelligence, enabling computers to perceive emotions as humans do has become a hot research field. As an emerging interdisciplinary direction, multimodal emotion recognition can improve recognition performance by exploiting the interaction information between modalities in addition to the information obtained from each single modality. However, current multimodal emotion recognition still faces several problems: modal features are not fully utilized, and differences between modal features cause information redundancy or conflict. Aiming to improve the recognition performance of emotion recognition models, this paper improves the multimodal fusion mechanism and explores the design of an emotion recognition system based on multimodal feature fusion. The specific work is as follows:

(1) For single-modal feature extraction, this paper takes into account the characteristics of the data in each modality and analyzes the properties of different neural networks. Long short-term memory (LSTM) networks are used to extract contextual information for the text and acoustic modalities, which have complex time-varying features, while a multi-scale convolutional neural network (MSCNN) is used to extract low-level features from images for the visual modality.

(2) For multimodal feature fusion, the datasets are first reconstructed: emotion categories are redistributed to balance the proportion of each category, and multimodal alignment is achieved. In the fusion stage, a cross-modal attention mechanism realizes the interaction among the three modalities, using low-level features from a source modality to enhance the features of a target modality. The cross-modal attention mechanism is embedded into an improved Transformer network, which reduces the complexity of the model, and a multi-head attention mechanism strengthens the representation ability of the network and improves the fusion effect.

(3) Based on the proposed model, this paper designs and implements a multimodal emotion recognition system that displays the results intuitively. The system performs emotion analysis on offline video clips and can also acquire audio and video data through the user's camera and microphone, using the trained model to realize real-time multimodal emotion recognition.

Experiments on the proposed models are conducted on several datasets and compared against representative state-of-the-art models. The proposed model achieves an accuracy of 84.1% and an F1 score of 82.9% on the IEMOCAP dataset, and recognition rates of 82.7% and 82.4% on the CMU-MOSI and CMU-MOSEI datasets respectively, an improvement of 3%-5%, which demonstrates that the proposed model performs well.
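As a rough illustration of the single-modal encoders in (1), the following PyTorch sketch pairs a bidirectional LSTM for sequential (text or acoustic) features with a multi-scale CNN for visual frames. The class names, layer sizes, and the 3/5/7 kernel sizes are assumptions made for illustration, not details taken from the thesis.

```python
# Sketch of the single-modal encoders in (1); all sizes are illustrative.
import torch
import torch.nn as nn

class TemporalEncoder(nn.Module):
    """LSTM encoder for modalities with time-varying features (text, audio)."""
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hid_dim, batch_first=True, bidirectional=True)

    def forward(self, x):                # x: (batch, seq_len, in_dim)
        out, _ = self.lstm(x)            # contextual features per time step
        return out                       # (batch, seq_len, 2 * hid_dim)

class MultiScaleCNN(nn.Module):
    """Parallel convolutions at several kernel sizes for visual frames."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):                # x: (batch, in_ch, H, W)
        feats = [self.pool(torch.relu(b(x))) for b in self.branches]
        return torch.cat(feats, dim=1).flatten(1)  # (batch, 3 * out_ch)
```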
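The cross-modal attention in (2) can be pictured as a Transformer-style block in which the target modality supplies the queries and a source modality supplies the keys and values, so low-level source features enhance the target representation. This is a minimal sketch under that assumption; the dimensions and head count are illustrative, not the thesis's configuration.

```python
# Hedged sketch of cross-modal attention fusion in a Transformer-style block.
import torch
import torch.nn as nn

class CrossModalBlock(nn.Module):
    """Cross-modal multi-head attention followed by a feed-forward sublayer."""
    def __init__(self, d_model: int = 128, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, target, source):   # (batch, len_t, d), (batch, len_s, d)
        # Queries come from the target modality; keys/values from the source,
        # so source features enhance the target representation.
        attended, _ = self.attn(target, source, source)
        x = self.norm1(target + attended)
        return self.norm2(x + self.ff(x))

# Usage: enhance text features with aligned acoustic features.
block = CrossModalBlock()
text = torch.randn(4, 20, 128)
audio = torch.randn(4, 20, 128)
fused = block(text, audio)               # (4, 20, 128)
```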
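For the real-time system in (3), a capture loop along these lines could feed camera frames to the trained model. The OpenCV calls are standard, but `model.predict` is a hypothetical placeholder for the thesis's recognition model, and audio capture via the microphone is omitted.

```python
# Illustrative real-time capture loop; model.predict is a placeholder.
import cv2

cap = cv2.VideoCapture(0)                # default camera
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # emotion = model.predict(frame, audio_chunk)  # hypothetical call
        cv2.imshow("emotion", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```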
Keywords/Search Tags:Multimodal emotion recognition, Attention mechanism, Feature fusion, Transformer network