Research On Emotion Recognition Based On Multimodal Fusion

Posted on: 2022-11-21
Degree: Master
Type: Thesis
Country: China
Candidate: X H Chen
GTID: 2518306764954559
Subject: Automation Technology
Abstract/Summary:
With the rapid development of artificial intelligence, people expect human-computer interaction to be more humanized and intelligent, which has made affective computing a current research hotspot. As a branch of affective computing, emotion recognition has broad application prospects. Current research methods fall into two groups: emotion recognition based on a single modality and emotion recognition based on multimodal fusion. Because a single modality cannot fully express emotional information, researchers have gradually shifted from single-modal to multimodal emotion recognition. However, the multimodal emotion recognition algorithms in use today do not make full use of the contextual correlation between modalities, which leads to low recognition and classification accuracy. To address this problem, this thesis proposes a context-based low-rank tensor multimodal fusion network model.

In multimodal emotion recognition tasks, researchers use the tensor fusion network, an end-to-end learned model that captures dynamic interactions within and between modalities and thereby improves recognition accuracy in complex scenarios. The key idea of the tensor fusion network is to convert the input modalities into tensors and train on them, but this makes the dimensionality and computational complexity grow exponentially. To address the failure of current multimodal emotion recognition algorithms to make full use of correlations in the information, this thesis proposes using a Gated Recurrent Unit (GRU) to model intra-modal context correlation, which makes full use of the valid information in video data. Moreover, the proposed network model can pass context information along during training, which improves the performance of fusion classification. The proposed low-rank tensor fusion network algorithm based on context modeling solves the problems of high dimensionality and exponentially increasing computational complexity in inter-modal fusion, and improves the classification efficiency of multimodal emotion recognition.

To further verify the classification efficiency and accuracy of the proposed algorithm, experiments were carried out on three data sets: CMU-MOSI, POM and IEMOCAP. The results show that, compared with the tensor fusion network method, the proposed algorithm improves the classification accuracy of emotion recognition by 2.9%, 1.3% and 12.2% on the three data sets, respectively. Compared with other methods, the proposed method also significantly improves classification efficiency and accuracy. The algorithm proposed in this thesis therefore provides a theoretical reference for research on multimodal fusion emotion recognition.
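The contrast between full tensor fusion and a low-rank alternative can be illustrated with a minimal sketch. It is not the thesis's implementation: the feature sizes, the rank, the random weights and every function name below are hypothetical, the GRU context modeling is omitted, and the weight tensor is simply assumed to factorize into a sum of rank-1 terms. Under that assumption, the fused output can be computed from per-modality projections without ever building the outer-product tensor.

```python
import random
from itertools import product
from math import prod

random.seed(0)

# Hypothetical sizes: per-modality feature dimensions, output size, and rank.
DIMS = {"audio": 2, "video": 3, "text": 4}
OUT_DIM, RANK = 3, 2

def rand_vec(n):
    """Return n standard-normal samples as a list."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]

# One feature vector per modality, with a constant 1 appended so that
# unimodal and bimodal terms survive the three-way product.
z = {m: rand_vec(d) + [1.0] for m, d in DIMS.items()}

# factors[m][r][o] is a weight vector of length d_m + 1 (assumed factorization).
factors = {m: [[rand_vec(d + 1) for _ in range(OUT_DIM)] for _ in range(RANK)]
           for m, d in DIMS.items()}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def full_tensor_fusion(z, factors):
    """Materialize the outer-product tensor, then contract it with the weights."""
    mods = list(z)
    sizes = [range(len(z[m])) for m in mods]
    # Fused tensor: one entry per index combination, so its size is the
    # product of the (d_m + 1) values over all modalities.
    fused = {idx: prod(z[m][i] for m, i in zip(mods, idx))
             for idx in product(*sizes)}
    out = []
    for o in range(OUT_DIM):
        total = 0.0
        for idx, value in fused.items():
            # Weight entry rebuilt from the rank-1 terms of the factorization.
            w = sum(prod(factors[m][r][o][i] for m, i in zip(mods, idx))
                    for r in range(RANK))
            total += value * w
        out.append(total)
    return out

def low_rank_fusion(z, factors):
    """Same result, computed from per-modality projections only."""
    return [sum(prod(dot(z[m], factors[m][r][o]) for m in z)
                for r in range(RANK))
            for o in range(OUT_DIM)]
```

Calling `full_tensor_fusion(z, factors)` and `low_rank_fusion(z, factors)` should give the same output vector up to floating-point rounding. The first path builds a fused tensor whose size is the product of the per-modality sizes, while the second only needs work that grows with their sum, once per rank.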
Keywords/Search Tags: Artificial intelligence, Emotion recognition, Convolutional neural network, Information Transport, Multimodal fusion