
Research on Multimodal Sentiment Analysis Based on Deep Neural Networks

Posted on: 2024-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: J W Guo
Full Text: PDF
GTID: 2568307103474504
Subject: Computer Science and Technology

Abstract/Summary:
Nowadays, the development of artificial intelligence has entered a new stage, and people expect intelligent systems to interact in a friendly, vivid, natural, and harmonious way, as humans do. To this end, researchers have proposed a new field of computer science, affective computing, whose core idea is to make computers capable of recognizing and expressing emotions like humans, thereby making human-computer interaction more natural. In everyday scenarios, humans express emotions or emphasize particular viewpoints mainly through their voices, their facial expressions, and the content they describe. Emotional expression therefore involves not only verbal information but also the non-verbal behavioral cues, visual and auditory, that accompany speech and that form an important channel of emotional expression. To accurately identify the emotions that humans intend to express, sentiment analysis must draw on multimodal data (mainly the visual, auditory, and verbal modalities), so that computers can be empowered to perceive human emotions.

This thesis focuses on sentiment analysis based on multimodal fusion and aims to address three problems: inadequate cross-modal fusion, heterogeneity of multimodal data sources, and limited representation of modal information. The main work comprises the following three points; a minimal illustrative sketch of each fusion scheme follows the abstract.

(1) To address the simple and insufficient interaction among modalities during cross-modal interaction, this thesis proposes a multimodal fusion model based on a multi-perspective graph attention mechanism. The complex multimodal inputs are modeled on a non-Euclidean (graph) data structure, and the expressive power of graph models is used to capture the potentially complex, multi-relational interactions among the modalities. Multimodal data from different perspectives are transformed into multimodal interaction graphs with heterogeneous nodes, and full interaction between the different modalities is carried out on these graphs, releasing the full expressive power of multimodal interaction. Experiments on a publicly available multimodal sentiment analysis dataset show that the proposed model outperforms previous multimodal fusion algorithms on the sentiment analysis task, validating its effectiveness.

(2) To address unaligned, heterogeneous multimodal data sources, this thesis proposes a Cross Hyper-modality Fusion Network that operates directly on the original unaligned multimodal inputs. Cross hyper-modality interaction between the language modality and the non-verbal behavioral information that accompanies it is performed without any pre-alignment operation. In this interaction, the accompanying non-verbal behavior dynamically adjusts the position of each word in the semantic space, so that the real emotional state the speaker intends to convey can be expressed clearly under different non-verbal behavioral contexts. Notably, the multimodal interaction in the Cross Hyper-modality Fusion Network is a direct interaction between the modalities, which improves the efficiency of multimodal fusion.

(3) To address the restricted representation of modal information during modal fusion, this thesis proposes a unidirectional (one-way) bimodal fusion network. Unlike previous approaches that project all modalities into a single joint representation space, the unidirectional bimodal fusion network first creates an independent representation for each modality and then uses a cross-modal attention mechanism to fuse the verbal-visual and verbal-auditory modality pairs, thereby establishing the dominance of the textual modality in the sentiment analysis task.

In summary, the three multimodal sentiment analysis methods proposed in this thesis effectively address insufficient cross-modal fusion, heterogeneity of multimodal data sources, and restricted modality representation in the multimodal fusion process, and they provide new ideas and techniques for the field of multimodal sentiment analysis.
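The abstract gives no implementation details for point (1); the following is only a rough sketch, assuming PyTorch, utterance-level feature vectors per modality, a fully connected interaction graph with one heterogeneous node per modality, and a single hand-rolled graph-attention layer. The class name GraphAttentionFusion and all dimensions are illustrative assumptions, not the author's design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GraphAttentionFusion(nn.Module):
        """Toy graph-attention fusion over heterogeneous modality nodes."""
        def __init__(self, dims, hidden=128):
            super().__init__()
            # One projection per modality maps heterogeneous nodes into a shared space.
            self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
            # GAT-style scoring over concatenated node pairs.
            self.att = nn.Linear(2 * hidden, 1)
            self.out = nn.Linear(hidden, 1)  # sentiment score

        def forward(self, feats):
            # feats: list of (batch, dim_m) utterance-level vectors, one per modality.
            nodes = torch.stack([p(x) for p, x in zip(self.proj, feats)], dim=1)  # (B, M, H)
            B, M, H = nodes.shape
            # Score every ordered node pair on the fully connected interaction graph.
            src = nodes.unsqueeze(2).expand(B, M, M, H)
            dst = nodes.unsqueeze(1).expand(B, M, M, H)
            scores = F.leaky_relu(self.att(torch.cat([src, dst], dim=-1))).squeeze(-1)  # (B, M, M)
            alpha = scores.softmax(dim=-1)
            # Each node aggregates messages from all modalities; mean-pool the graph.
            fused = (alpha.unsqueeze(-1) * dst).sum(dim=2).mean(dim=1)  # (B, H)
            return self.out(fused)

    # Example: text/audio/visual feature vectors of different sizes.
    model = GraphAttentionFusion(dims=[768, 74, 35])
    t, a, v = torch.randn(4, 768), torch.randn(4, 74), torch.randn(4, 35)
    print(model([t, a, v]).shape)  # torch.Size([4, 1])

In the thesis's multi-perspective setting, several such interaction graphs (one per perspective) would presumably be built and their outputs combined; the sketch keeps a single graph for brevity.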
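For point (2), the sketch below illustrates only the core idea: unaligned non-verbal sequences shift word embeddings through cross-modal attention, with no pre-alignment step. It assumes PyTorch's nn.MultiheadAttention with kdim/vdim set to the non-verbal feature size; the class name CrossModalWordShift, the gated residual, and all dimensions are assumptions rather than the thesis's actual architecture.

    import torch
    import torch.nn as nn

    class CrossModalWordShift(nn.Module):
        """Text tokens attend to an unaligned non-verbal sequence; the attention
        output is added to the word embeddings as a shift in semantic space."""
        def __init__(self, text_dim, other_dim, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(embed_dim=text_dim, num_heads=heads,
                                              kdim=other_dim, vdim=other_dim,
                                              batch_first=True)
            self.gate = nn.Linear(text_dim, text_dim)

        def forward(self, text, other):
            # text:  (batch, L_t, text_dim)   word embeddings
            # other: (batch, L_o, other_dim)  unaligned audio/visual frames (L_o != L_t is fine)
            shift, _ = self.attn(query=text, key=other, value=other)
            # Gated residual: non-verbal context nudges each word's position.
            return text + torch.sigmoid(self.gate(text)) * shift

    # Unaligned lengths: 20 words vs. 50 audio frames and 35 video frames.
    text = torch.randn(2, 20, 300)
    audio, video = torch.randn(2, 50, 74), torch.randn(2, 35, 35)
    shift_a = CrossModalWordShift(300, 74)
    shift_v = CrossModalWordShift(300, 35)
    fused_words = shift_v(shift_a(text, audio), video)  # still (2, 20, 300)
    print(fused_words.shape)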
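For point (3), the sketch below is one plausible reading of the unidirectional bimodal fusion: independent GRU encoders build a separate representation per modality, the textual representation acts as the sole query in two cross-modal attention blocks (verbal-visual and verbal-auditory), and their pooled outputs feed a sentiment head. All module choices, names, and sizes are illustrative assumptions.

    import torch
    import torch.nn as nn

    class UnidirectionalBimodalFusion(nn.Module):
        """Independent per-modality encoders, then one-way cross-modal attention
        in which text queries the visual and acoustic streams separately."""
        def __init__(self, t_dim, a_dim, v_dim, hidden=128, heads=4):
            super().__init__()
            # Independent representations: one GRU encoder per modality.
            self.enc_t = nn.GRU(t_dim, hidden, batch_first=True)
            self.enc_a = nn.GRU(a_dim, hidden, batch_first=True)
            self.enc_v = nn.GRU(v_dim, hidden, batch_first=True)
            # Text is always the query (the dominant modality); fusion is one-way.
            self.t2a = nn.MultiheadAttention(hidden, heads, batch_first=True)
            self.t2v = nn.MultiheadAttention(hidden, heads, batch_first=True)
            self.head = nn.Linear(2 * hidden, 1)

        def forward(self, text, audio, video):
            ht, _ = self.enc_t(text)    # (B, L_t, H)
            ha, _ = self.enc_a(audio)   # (B, L_a, H)
            hv, _ = self.enc_v(video)   # (B, L_v, H)
            ta, _ = self.t2a(ht, ha, ha)  # verbal-auditory fusion
            tv, _ = self.t2v(ht, hv, hv)  # verbal-visual fusion
            # Pool over the text time axis and predict a sentiment score.
            fused = torch.cat([ta.mean(dim=1), tv.mean(dim=1)], dim=-1)
            return self.head(fused)

    model = UnidirectionalBimodalFusion(300, 74, 35)
    out = model(torch.randn(2, 20, 300), torch.randn(2, 50, 74), torch.randn(2, 35, 35))
    print(out.shape)  # torch.Size([2, 1])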
Keywords/Search Tags: Emotion Recognition, Multimodal Sentiment Analysis, Multimodal Fusion, Cross-Modality Interaction, Modality Representation