
Research and Implementation of Multimodal Emotion Recognition and Emotion-Cause Pair Extraction Algorithms

Posted on: 2024-01-29  Degree: Master  Type: Thesis
Country: China  Candidate: X J Feng  Full Text: PDF
GTID: 2558306920955699  Subject: Software engineering
Abstract/Summary:
With the diversification of information, many research methods in the multimodal field have been proposed, and multimodality has become one of the hot AI topics of recent years. Examples include captions, images, and speech for video; CT, PET, and MRI scans in medicine; and semantics and context for text. Multimodal data thus falls mainly into two kinds: different modalities of the same object, and the same type of modal data captured from different perspectives (sensors). This thesis takes multimodality as its starting point: it first combines the multimodal information of a video to recognize personal emotion, and then uses the multimodality of text to extract the causes of that emotion, providing both sentiment analysis of short videos and supporting evidence for the predictions.

This thesis investigates video emotion recognition and emotion-cause pair extraction, two tasks that provide an important foundation for downstream sentiment-analysis research. Both tasks still have shortcomings in their existing frameworks. Video sentiment analysis is limited by traditional encoding methods, which cannot fully express each modality's information, so the prediction fails to capture the true sentiment of an utterance. Emotion-cause pair extraction often neglects textual context, so dynamic word vectors fail to express their context-specific meaning. This thesis explores these two issues further and makes the following contributions, which differ from previous studies.

1. A multimodal emotion recognition algorithm based on Quantum Self-Attention Networks with Residual Modules (QRSAN) is proposed. Building on quantum computing, the algorithm strengthens the representation of each modality and addresses the failure of existing studies to represent modal data fully. The data pass through four operations: quantum encoding, quantum computation, multimodal fusion, and measurement, yielding modal representations with enhanced expressive power. The quantum representation alleviates the limited representational capacity of traditional neural networks, removes redundant cross-modal information, and improves cross-modal consistency, which in turn improves the model's emotion recognition.

2. An Emotion-Cause Pair Extraction model based on context and semantic dual feature fusion (CSF-ECPE) is proposed. The model uses a Bi-LSTM network and a GCN to construct contextual and semantic emotion-cause pair features, respectively, and then fuses the two with an attention mechanism that incorporates clause-position features. For cause extraction and emotion extraction, integrating semantic features into the contextual feature vector improves accuracy; for emotion-cause pair extraction, accuracy is improved by fusing position-aware semantic word vectors with contextual features.

Experiment I shows that the multimodal emotion recognition algorithm based on the quantum residual attention mechanism performs effective emotion recognition on videos. Experiment II evaluates the emotion-cause pair extraction algorithm that combines the text and context of video emotion causes, giving a reasonable basis for the predicted video sentiment. In summary, this thesis proposes a multimodal emotion recognition algorithm and an emotion-cause pair extraction algorithm. The former combines video characteristics, separating the three modalities of text, image, and speech for pre-processing and model training, and finally predicts the video's sentiment. The latter combines textual features to construct contextual features and language-model semantic features, which are then fused to predict emotion-cause pairs.
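The quantum encoding and measurement steps described for QRSAN can be illustrated classically. The sketch below simulates amplitude encoding and Born-rule measurement with NumPy; it is a minimal illustration under standard quantum-computing conventions, not the thesis's actual circuit, and the function names are hypothetical.

```python
import numpy as np

def amplitude_encode(x):
    """Encode a real feature vector as a quantum state by normalizing
    it to unit L2 norm (the standard amplitude-encoding convention)."""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

def measure_probabilities(state):
    """Born rule: measurement probabilities are squared amplitudes."""
    return np.abs(state) ** 2

features = np.array([3.0, 4.0])        # toy-sized feature vector
state = amplitude_encode(features)     # -> [0.6, 0.8]
probs = measure_probabilities(state)   # -> [0.36, 0.64], sums to 1
```

Because the state is unit-norm, the measurement probabilities always sum to one, which is what lets a measured representation be read as a normalized feature distribution.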
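The residual self-attention fusion in QRSAN has a classical analogue: scaled dot-product self-attention with a skip connection over concatenated modality features. The sketch below assumes the three modalities (text, audio, visual) have already been projected to a shared dimension; the weights, shapes, and concatenation-based fusion are illustrative assumptions, not the thesis's exact architecture.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_residual(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention with a residual (skip)
    connection, a classical stand-in for QRSAN's residual module."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))   # attention over all tokens
    return X + A @ V                    # residual connection

rng = np.random.default_rng(0)
d = 8
# Three modality sequences of 5 tokens each, already projected to a
# shared dimension d (an assumption for this sketch).
modalities = [rng.normal(size=(5, d)) for _ in range(3)]
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
# Fuse by concatenating along the sequence axis, so attention lets
# every token attend across modality boundaries.
fused = np.concatenate(modalities, axis=0)       # shape (15, d)
out = self_attention_residual(fused, Wq, Wk, Wv)
```

The residual term preserves each modality's original representation while the attention term mixes in cross-modal context, which is the usual motivation for skip connections in attention stacks.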
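The dual-feature design of CSF-ECPE (a Bi-LSTM contextual branch, a GCN semantic branch, and position-aware fusion) can be sketched as follows. Here the Bi-LSTM output is stood in for by random clause features, the GCN layer follows the standard symmetric normalization with self-loops, and the exponential distance weighting is a simplified stand-in for the thesis's clause-position attention; all of these simplifications are assumptions of the sketch.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: adjacency with self-loops,
    symmetric degree normalization, linear map, then ReLU."""
    A_hat = A + np.eye(A.shape[0])
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

def position_weighted_fusion(ctx, sem, emo_idx, alpha=1.0):
    """Fuse contextual and semantic clause features, weighting each
    clause by its distance to the emotion clause (a simplified
    stand-in for clause-position attention)."""
    n = ctx.shape[0]
    dist = np.abs(np.arange(n) - emo_idx)
    w = np.exp(-alpha * dist)[:, None]   # nearer clauses weigh more
    return w * ctx + (1.0 - w) * sem

rng = np.random.default_rng(1)
n, d = 6, 4                          # six clauses, feature size 4
A = (rng.random((n, n)) > 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T       # symmetric clause graph
X = rng.normal(size=(n, d))          # clause embeddings
sem = gcn_layer(A, X, rng.normal(size=(d, d)) * 0.1)  # semantic branch
ctx = rng.normal(size=(n, d))        # stands in for Bi-LSTM output
fused = position_weighted_fusion(ctx, sem, emo_idx=2)
```

At the emotion clause itself the weight is 1, so its fused vector is purely contextual, while distant clauses lean more on the graph-derived semantic features.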
Keywords/Search Tags:multimodal emotion recognition, emotion cause pair extraction, attentional mechanisms, quantum neural networks, graph neural networks