
Research On Multimodal Emotion Recognition And Human-computer Interaction In Virtual Environment

Posted on: 2022-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: J G Dong
Full Text: PDF
GTID: 2518306575964699
Subject: Control Science and Engineering

Abstract/Summary:
Multimodal emotion recognition is an active and challenging research field in artificial intelligence. It improves emotion recognition performance by fusing emotional information from multiple modalities. The main difficulties are learning discriminative unimodal emotional features and fully mining the complementary information between modalities through multimodal fusion. Speech and facial expressions are two of the most natural and effective ways for human beings to express emotion. This thesis studies multimodal emotion recognition based on speech and facial expression, together with natural human-computer interaction in a virtual environment. The work covers unimodal emotion feature learning, multimodal emotion fusion, and human-computer interaction in a virtual environment. The specific research contents are as follows:

1. For unimodal emotional feature learning, this thesis combines a Bidirectional Long Short-Term Memory network (BiLSTM) with a Convolutional Neural Network (CNN) for speech emotion recognition, learning context-related information and local high-level features from the speech signal (see the first sketch below). For facial expression recognition, a neural network based on small-scale convolution kernels is proposed: replacing large convolution kernels with stacked small ones deepens the network and strengthens its nonlinear expressive power, so that local high-level features of facial expressions can be learned (second sketch below). Experiments on the IEMOCAP dataset show that the recognition rates of speech emotion recognition and facial expression recognition reach 58.97% and 60.19%, respectively.

2. For multimodal emotion fusion, this thesis first fuses the speech emotion features and facial expression features with a feature-level fusion method. It then proposes a model-level fusion method based on a neural network: after the speech and expression features are fused, a neural network learns the complementary information between them (third sketch below). Experiments on the IEMOCAP dataset show that the model-level fusion method reaches a recognition rate of 70.24%, demonstrating its effectiveness.

3. Finally, this thesis applies the multimodal emotion recognition algorithm to a virtual-environment interaction system to verify its effectiveness in a real scene. Multiple comparative experiments show that the system correctly recognizes the user's emotions, and the virtual characters respond with corresponding interactive actions.
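To make the BiLSTM + CNN approach concrete, the following is a minimal PyTorch sketch of such a speech emotion model: a CNN front end extracts local high-level features from a spectrogram, and a BiLSTM captures context-related information over time. The 40-band mel-spectrogram input, layer sizes, and four-class output are illustrative assumptions, not the thesis's actual configuration.

```python
import torch
import torch.nn as nn

class SpeechEmotionNet(nn.Module):
    """Hypothetical BiLSTM + CNN speech emotion model (sizes assumed)."""

    def __init__(self, n_mels=40, hidden=128, n_classes=4):
        super().__init__()
        # CNN front end: local high-level features from the spectrogram
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # BiLSTM over the time axis: context-related information
        self.lstm = nn.LSTM(input_size=64 * (n_mels // 4), hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                        # x: (batch, 1, n_mels, frames)
        f = self.cnn(x)                          # (batch, 64, n_mels/4, frames/4)
        f = f.permute(0, 3, 1, 2).flatten(2)     # (batch, time, features)
        out, _ = self.lstm(f)
        return self.fc(out[:, -1])               # classify from the last time step
```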
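The small-scale convolution kernel idea can likewise be sketched: two stacked 3x3 convolutions cover the same receptive field as a single 5x5 kernel while adding depth and an extra nonlinearity. The grayscale 48x48 input, channel counts, and class count below are assumptions for illustration.

```python
import torch.nn as nn

def small_kernel_block(c_in, c_out):
    # Two 3x3 convs span a 5x5 receptive field with more depth and nonlinearity
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
    )

class ExpressionNet(nn.Module):
    """Hypothetical small-kernel facial expression model (sizes assumed)."""

    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            small_kernel_block(1, 32),    # 48x48 -> 24x24 (grayscale faces)
            small_kernel_block(32, 64),   # 24x24 -> 12x12
            small_kernel_block(64, 128),  # 12x12 -> 6x6
        )
        self.classifier = nn.Linear(128 * 6 * 6, n_classes)

    def forward(self, x):                 # x: (batch, 1, 48, 48)
        return self.classifier(self.features(x).flatten(1))
```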
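Finally, a minimal sketch of the two fusion strategies: feature-level fusion simply concatenates the unimodal feature vectors, while model-level fusion passes the concatenated vector through a further network that learns cross-modal complementary information. The feature dimensions are assumed for illustration.

```python
import torch
import torch.nn as nn

class ModelLevelFusion(nn.Module):
    """Hypothetical fusion head over unimodal embeddings (sizes assumed)."""

    def __init__(self, d_speech=256, d_face=128, n_classes=4):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(d_speech + d_face, 256), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, n_classes),
        )

    def forward(self, f_speech, f_face):
        # Feature-level fusion: concatenate the unimodal embeddings
        fused = torch.cat([f_speech, f_face], dim=1)
        # Model-level fusion: learn complementary information from the fused vector
        return self.fusion(fused)

# Usage with dummy unimodal embeddings (batch of 8, shapes assumed)
logits = ModelLevelFusion()(torch.randn(8, 256), torch.randn(8, 128))
```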
Keywords/Search Tags: virtual environment interaction, speech emotion recognition, multimodal fusion, deep learning