
Multi-modal Emotion Recognition Based On Deep Learning

Posted on: 2021-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2428330611498052
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Emotion recognition is an important branch of affective computing. As artificial intelligence technology develops, human-computer interaction increasingly pursues more humanized and intelligent experiences, and emotion recognition has become a research hotspot. Unimodal emotion recognition often suffers from incomplete information, susceptibility to interference, and low recognition rates. Multimodal emotion recognition has therefore attracted wide attention from researchers, and much work has been done on emotion recognition from speech, video, text, and physiological signals. By fusing complementary information across modalities, multimodal emotion recognition can improve the final recognition rate. This thesis builds a multimodal emotion recognition model based on speech, video, and text.

First, the thesis studies effective feature extraction methods for speech, video, and text. For speech input, a long short-term memory network (LSTM) extracts speech features; because the output of a speech signal at each time step depends on the preceding and following frames, this network makes better use of that temporal context. For video input, a densely connected convolutional network (DenseNet) extracts image features. Rather than improving performance by deepening the network (as in ResNet) or widening it (as in Inception), DenseNet reuses features through dense bypass connections, which greatly reduces the number of network parameters and, to a certain extent, alleviates the vanishing-gradient problem. For text, an LSTM is again used, as it can effectively extract emotional semantics and word-order information.

Second, to fuse the information of the three modalities effectively, the thesis studies fusion methods for multimodal emotion recognition. Feature-level fusion can exploit the information of each modality, but direct concatenation merely splices together the emotion feature vectors output by each modality. This thesis therefore introduces an attention mechanism into feature-level fusion: the mechanism learns reasonable weights from the distribution of the dataset and applies them in the final feature fusion, making the multimodal recognition results more accurate.

Finally, unimodal, bimodal, and multimodal comparison experiments are designed, with five-class, four-class, three-class, and two-class classification experiments over the ten emotion categories of the IEMOCAP dataset. The experimental results show that bimodal emotion recognition is 6.2% more accurate than unimodal recognition, and multimodal recognition is 8.98% more accurate than bimodal recognition, verifying the effectiveness of multimodal emotion recognition.
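To make the pipeline concrete, the sketch below shows the three unimodal encoders the abstract describes: an LSTM over acoustic frames, a DenseNet over video frames, and an LSTM over word embeddings. The abstract gives no layer sizes or hyperparameters, so everything here is an assumption for illustration only: PyTorch as the framework, DenseNet-121 as the visual backbone, 74-dimensional acoustic features, 300-dimensional word embeddings, and a shared 128-dimensional output size.

import torch
import torch.nn as nn
from torchvision.models import densenet121

class SpeechEncoder(nn.Module):
    """LSTM over per-frame acoustic features; returns the last hidden state."""
    def __init__(self, feat_dim=74, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
    def forward(self, x):             # x: (batch, frames, feat_dim)
        _, (h, _) = self.lstm(x)
        return h[-1]                  # (batch, hidden)

class TextEncoder(nn.Module):
    """LSTM over pretrained word embeddings; returns the last hidden state."""
    def __init__(self, emb_dim=300, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
    def forward(self, x):             # x: (batch, words, emb_dim)
        _, (h, _) = self.lstm(x)
        return h[-1]

class VisualEncoder(nn.Module):
    """DenseNet backbone on a face/frame crop, projected to the shared size."""
    def __init__(self, hidden=128):
        super().__init__()
        backbone = densenet121(weights=None)
        backbone.classifier = nn.Identity()   # keep the 1024-d pooled features
        self.backbone = backbone
        self.proj = nn.Linear(1024, hidden)
    def forward(self, x):             # x: (batch, 3, 224, 224)
        return self.proj(self.backbone(x))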
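The fusion step replaces direct concatenation with learned per-modality weights. The abstract does not specify how those weights are computed, so the following is one plausible reading, continuing the sketch above: a shared linear scorer rates each modality's feature vector, the scores are softmax-normalized, and the weighted vectors are concatenated before a ten-way classifier (matching the ten IEMOCAP emotion categories the abstract names).

class AttentionFusion(nn.Module):
    """Score each modality, softmax the scores, fuse the weighted features."""
    def __init__(self, hidden=128, n_modalities=3, n_classes=10):
        super().__init__()
        self.score = nn.Linear(hidden, 1)                 # shared scorer
        self.classifier = nn.Linear(hidden * n_modalities, n_classes)
    def forward(self, feats):         # feats: list of (batch, hidden) tensors
        stacked = torch.stack(feats, dim=1)               # (batch, M, hidden)
        w = torch.softmax(self.score(stacked), dim=1)     # (batch, M, 1)
        fused = (w * stacked).flatten(start_dim=1)        # weighted concat
        return self.classifier(fused)

A forward pass over the three encoders would then be logits = AttentionFusion()([speech_vec, visual_vec, text_vec]); at plain concatenation the weights w would all be 1, so the attention layer's only job is to rescale each modality's contribution before the classifier.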
Keywords/Search Tags:multimodal emotion recognition, deep learning, feature level fusion, attention mechanism