
Research on Emotion Recognition Based on Multimodal Features

Posted on: 2022-12-25    Degree: Master    Type: Thesis
Country: China    Candidate: J Tang    Full Text: PDF
GTID: 2518306779488964    Subject: Computer Software and Applications
Abstract/Summary:
As a hot field of human-computer interaction, emotion recognition technology has been applied in medicine, education, safe driving, e-commerce, and other areas. Emotion is conveyed through rich channels such as facial expression, voice, text, and posture, but the relationship between any single channel and the emotion it expresses is not simply linear: emotion is the comprehensive embodiment of a series of human behaviors and environmental factors, so emotion recognition should likewise be studied from multiple dimensions. Existing methods suffer from low recognition accuracy, inaccurate prediction, and complex pre- and post-processing. Starting from multi-dimensional, multimodal representations of the human body and its environment, this thesis improves recognition accuracy in two ways.

(1) Existing gait-based emotion recognition methods use only the spatio-temporal information of skeleton joints or only skeleton rotation information, and do not consider fusing the two. This thesis proposes an adaptive-fusion emotion recognition method that makes full use of gait features by combining skeleton spatio-temporal information with skeleton rotation angles. First, the model uses an autoencoder to extract skeleton rotation information from human walking. Then, a spatio-temporal graph convolutional network extracts the spatio-temporal information of the skeleton joints. Finally, to fully characterize both the rotation and the spatio-temporal features, a fusion network that adaptively learns feature weights is proposed to fuse them, and the fused features are classified. Experiments on the Emotion-Gait dataset show that, after fusing the two feature types with the adaptive fusion network, the AP values for sadness, anger, and neutral increase by 5%, 8%, and 5% respectively over the latest HAP method, and the mAP of the overall classification increases by 5%.

(2) In multimodal emotion recognition, fusing skeleton rotation features with skeleton spatio-temporal features improves accuracy but ignores the influence of environmental factors on personal emotion. To incorporate the influence of background information on emotion, a multimodal self-supervised emotion recognition network that integrates background information is proposed. The network integrates voice, text, facial expression, and other modalities and combines them with background information for emotion recognition. Frames of the SIMS video dataset, with facial information masked, are passed through a 3D ConvNet to obtain temporally ordered background features. These background features are fused with each modality (expression, voice, and text) to reduce the instability of single-modality subtask training. Experiments on the SIMS dataset show that the MAE, Corr, Acc-2, and F1-Score indicators improve by 0.006, 0.026, 0.59, and 0.4 respectively compared with previous two-modality and three-modality fusion methods.
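The adaptive fusion in contribution (1) can be sketched as a learned convex combination of the two gait feature vectors. This is a minimal illustration, not the thesis's actual network: the function names are assumptions, the fusion logits are fixed constants here (in training they would be learnable parameters updated by backpropagation), and the feature vectors are stand-ins for the autoencoder and ST-GCN outputs.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def adaptive_fuse(rotation_feat, spatiotemporal_feat, fusion_logits):
    """Fuse two gait feature vectors with adaptively learned scalar weights.

    The logits are passed through a softmax so the two weights are
    positive and sum to 1, giving a convex combination of the features.
    """
    w = softmax(fusion_logits)
    return w[0] * rotation_feat + w[1] * spatiotemporal_feat

rot = np.array([0.2, 0.8, 0.1])  # e.g. rotation features from an autoencoder
st = np.array([0.6, 0.4, 0.9])   # e.g. joint features from an ST-GCN
fused = adaptive_fuse(rot, st, np.array([0.0, 0.0]))  # equal logits -> mean
```

With equal logits the fusion reduces to a plain average; as one logit grows, the fused vector shifts toward that branch, which is how the network can learn to trust the more informative feature type.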
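At its simplest, the fusion of background context with each modality in contribution (2) can be sketched as attaching a shared background feature to every per-modality feature before the unimodal subtask heads. This is a hedged sketch, not the thesis's network: the dictionary keys, vector sizes, and plain concatenation (rather than a learned fusion) are all illustrative assumptions.

```python
import numpy as np

def fuse_with_background(modal_feats, background_feat):
    """Attach a shared background feature to each modality's feature vector.

    modal_feats: dict mapping modality name -> 1-D feature vector
    background_feat: 1-D vector, e.g. a pooled 3D-ConvNet output computed
        over video frames with the facial region masked out
    """
    return {name: np.concatenate([feat, background_feat])
            for name, feat in modal_feats.items()}

# Illustrative feature vectors; real features would come from each encoder.
feats = {"text": np.ones(4), "audio": np.zeros(4), "vision": np.full(4, 0.5)}
bg = np.array([0.1, 0.9])
fused = fuse_with_background(feats, bg)
```

Because every unimodal branch now sees the same background context, the single-modality subtasks share a common signal, which is one way to reduce the training instability the abstract mentions.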
Keywords/Search Tags: Multimodal emotion recognition, Spatio-temporal graph convolutional network, Attention mechanism, Feature fusion, Emotion recognition, Autoencoder