
Multimodal Sentiment Analysis Based On Transformer And Multi-Task Learning

Posted on: 2024-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q B Guo
Full Text: PDF
GTID: 2568307076473174
Subject: Computer technology
Abstract/Summary:
Sentiment analysis has important application value in many economic and social fields, such as market research, risk management, public opinion analysis, and decision-making support. The rapid development of short-video social networks such as TikTok and Kwai has pushed researchers in sentiment analysis to expand their scope from text content to multimedia content. Multimodal sentiment analysis aims to mine sentiment information from video content comprising text, audio, and visual data. Research on building mapping relationships between multimodal data and sentiment orientation with deep learning has achieved some results, but two bottlenecks still limit the effectiveness of sentiment prediction: (1) there are latent sentiment associations both among modalities and across the context, and deep mining and fusion of these multimodal sentiment associations remain difficult; (2) sentiment information is often unevenly distributed across the modalities, making it hard to fully exploit each modality's information for multimodal collaborative learning. To break through these bottlenecks, this thesis studies the main challenges in the field in depth and performs multimodal sentiment analysis of video content based on the Transformer, multi-task learning, and related techniques. The main work and contributions are as follows:

1. This thesis proposes a Transformer-based multimodal sentiment analysis model. The model attends not only to the sentiment relationships among modalities but also to the contextual sentiment relationships among them, improving the semantic context. Through a cross-modal multi-head attention mechanism, it performs multi-level association mining to build a tightly interwoven sentiment association network. By exploring the essential correlation between the modalities and the subject's sentiment fluctuations, the model fully mines the latent contextual sentiment correlations among modalities and thus identifies the real sentiment contained in the original data more accurately.

2. This thesis proposes a self-supervised unimodal label generation method. When the multimodal label is known, unimodal labels can be generated solely from the mapping relationship between multimodal representations and labels, without a complex deep network. The method labels unimodal data automatically in stages, quantifying the representation-to-label mapping in representation space to produce unimodal weak labels. The model thereby realizes multimodal collaborative learning under incomplete sentiment annotation, which provides a new perspective on the full use of multimodal information.

Experimental results on classic datasets in the field of multimodal sentiment analysis show that the models designed in the above work achieve satisfactory results and surpass the baseline models in accuracy and F1-score.
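The cross-modal multi-head attention described in contribution 1 can be sketched roughly as follows. This is a hypothetical PyTorch sketch, not the thesis's actual architecture: the module name, tensor shapes, and the residual-plus-norm arrangement are all assumptions; the idea shown is only that queries from one modality attend over another modality's sequence.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Queries come from one modality (e.g. text); keys and values come from
    another (e.g. audio), so each text step attends over the audio sequence."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_mod: torch.Tensor, context_mod: torch.Tensor) -> torch.Tensor:
        # keys and values both come from the context modality
        fused, _ = self.attn(query_mod, context_mod, context_mod)
        return self.norm(query_mod + fused)  # residual connection, then norm

# Toy usage: batch of 2, text length 10, audio length 30, feature dim 64.
text = torch.randn(2, 10, 64)
audio = torch.randn(2, 30, 64)
out = CrossModalAttention()(text, audio)
print(out.shape)  # torch.Size([2, 10, 64]) — output follows the query length
```

Stacking such blocks in both directions (text→audio, audio→text, etc.) and across utterances is one way a "multi-level" association network could be built.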
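One plausible instantiation of the weak-label idea in contribution 2 is to shift the known multimodal label by how close a sample's unimodal representation lies to the positive versus negative class centers in representation space. The following minimal NumPy sketch illustrates that idea only; the function name, the α weighting, and the distance-based offset are assumptions, not the thesis's exact formulation.

```python
import numpy as np

def unimodal_weak_labels(uni_repr: np.ndarray, multi_labels: np.ndarray,
                         alpha: float = 0.5) -> np.ndarray:
    """Derive weak unimodal labels from known multimodal labels by shifting
    each label toward the sentiment implied by its unimodal representation's
    relative distance to the positive / negative class centers."""
    pos_center = uni_repr[multi_labels > 0].mean(axis=0)
    neg_center = uni_repr[multi_labels < 0].mean(axis=0)
    d_pos = np.linalg.norm(uni_repr - pos_center, axis=1)
    d_neg = np.linalg.norm(uni_repr - neg_center, axis=1)
    # relative offset in [-1, 1]: positive when closer to the positive center
    offset = (d_neg - d_pos) / (d_neg + d_pos + 1e-8)
    return multi_labels + alpha * offset

# Toy data: 8 samples, 16-dim unimodal (e.g. audio) representations,
# multimodal sentiment labels on a [-3, 3] scale.
rng = np.random.default_rng(0)
reprs = rng.normal(size=(8, 16))
labels = np.array([2.0, -1.5, 0.5, -2.0, 1.0, -0.5, 3.0, -3.0])
weak = unimodal_weak_labels(reprs, labels)
print(weak.shape)  # (8,)
```

Because the offset is bounded, each weak label stays within α of its multimodal label, so the generated supervision never strays far from the ground truth while still differing per modality.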
Keywords/Search Tags:Multimodal sentiment analysis, Transformer, Cross-modal sentiment association mining, Multi-task learning, Self-supervised label generation