
Research On Key Technologies Of Sentiment Analysis Based On Multimodal Fusion

Posted on: 2022-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: J T Yang
Full Text: PDF
GTID: 2518306524490484
Subject: Master of Engineering
Abstract/Summary:
Human language is usually a mixture of natural language, body and facial movements, and acoustic behavior. It carries a great deal of emotional information, an important factor in human emotional tendencies and behavioral intentions. Sentiment analysis has achieved good results on text, speech, and images taken separately. As an emerging interdisciplinary direction within sentiment analysis, multimodal sentiment analysis captures not only the information within a single modality but also the latent information between modalities, allowing machines to recognize emotion in a way closer to how humans do.

Building an algorithmic model for multimodal time-series data raises the following common problems: 1) the modalities' differing inherent sampling rates leave the modal sequences unaligned; 2) cross-modal elements exhibit context dependence. Both problems degrade the fusion of cross-modal elements. The thesis addresses them with the following work:

1. The thesis designs a cross-modal attention mechanism that accounts for both the local word-level alignment of multimodal elements and the global context dependence of cross-modal elements. In an end-to-end manner, the mechanism captures hidden information and alignment across modalities through cross-attention between the asynchronous modality sequences in the multimodal data, thereby realizing the alignment and fusion of multimodal features (a schematic sketch follows this abstract). For data that has already been force-aligned, the model can perform three-modality cross-modal interaction directly in the cross-modal stage, further improving its performance when emotions are expressed unnaturally. The cross-modal attention mechanism is then embedded into a Transformer network to build the multimodal sentiment analysis model of this thesis.

2. The thesis also considers the expressive bottleneck that the standard Transformer network may face on unaligned data, and introduces the "talking-heads" mechanism for attention heads to obtain better performance (see the second sketch below).

3. On top of the proposed model, the thesis follows the front-end/back-end separation approach to Web system development to design and implement a multimodal sentiment analysis system that can be deployed in a distributed manner with good performance. The system supports offline and online analysis of individual emotional monologues, which has value for research, social, and market applications.

The proposed algorithm is tested on both aligned and unaligned data. Compared with the current best results, it achieves good accuracy: a maximum gain of about 0.5% on aligned data; about 0.5%-1% on the unaligned data of the CMU_MOSI and CMU_MOSEI datasets; and about 1.5%-3% on the unaligned data of the IEMOCAP dataset. Furthermore, accuracy improves by 5%-10% over models that depend on the alignment assumption and the CTC algorithm, which confirms the effectiveness of the model in this thesis.
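To make the fusion idea in item 1 concrete, below is a minimal PyTorch sketch of cross-modal attention between two unaligned modality sequences: the target modality supplies the queries and the source modality supplies the keys and values, so the two sequences need not share a length or sampling rate. This illustrates the general technique only; the class and parameter names (CrossModalAttention, d_model, etc.) are illustrative assumptions, not taken from the thesis.

```python
# Minimal sketch of cross-modal attention over unaligned sequences (assumed
# PyTorch implementation; names are illustrative, not the thesis's code).
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """A target modality (e.g. text) attends to a source modality (e.g. audio).

    Because queries and keys come from different sequences, the modalities
    do not need to be pre-aligned or to share a sampling rate.
    """
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, target: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # target: (batch, len_t, d_model); source: (batch, len_s, d_model).
        # len_t and len_s may differ, so no forced word-level alignment is needed.
        fused, _ = self.attn(query=target, key=source, value=source)
        return self.norm(target + fused)  # residual connection + layer norm

# Usage: fuse a long audio stream into a shorter text stream.
text = torch.randn(2, 50, 128)    # 50 text tokens
audio = torch.randn(2, 375, 128)  # 375 audio frames (higher sampling rate)
fused_text = CrossModalAttention(d_model=128)(text, audio)
print(fused_text.shape)  # torch.Size([2, 50, 128])
```

In a full model, blocks like this would be stacked inside Transformer layers for each directed modality pair, but that wiring is omitted here for brevity.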
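Likewise, here is a minimal sketch of talking-heads attention (Shazeer et al., 2020), the head-mixing mechanism that item 2 appears to reference: learned linear maps mix information across attention heads both before and after the softmax, relieving the expressive bottleneck of standard multi-head attention. Again an assumed PyTorch rendering with illustrative names, not the thesis's exact implementation.

```python
# Minimal sketch of talking-heads attention: linear projections mix the
# attention logits and the attention weights across the head dimension.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TalkingHeadsAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.h, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Head-mixing maps applied to logits (pre-softmax) and weights (post-softmax).
        self.pre_softmax = nn.Linear(n_heads, n_heads, bias=False)
        self.post_softmax = nn.Linear(n_heads, n_heads, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape each to (batch, heads, seq, d_head).
        q, k, v = (t.view(b, n, self.h, self.d_head).transpose(1, 2) for t in (q, k, v))
        logits = q @ k.transpose(-2, -1) / self.d_head ** 0.5  # (b, h, n, n)
        # Mix across the head dimension before and after the softmax.
        logits = self.pre_softmax(logits.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        weights = F.softmax(logits, dim=-1)
        weights = self.post_softmax(weights.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        out = (weights @ v).transpose(1, 2).reshape(b, n, self.h * self.d_head)
        return self.out(out)

x = torch.randn(2, 50, 128)
print(TalkingHeadsAttention(128)(x).shape)  # torch.Size([2, 50, 128])
```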
Keywords/Search Tags: multimodal, sentiment analysis, attention mechanism, feature fusion, transformer network