With the development of information technology, social media users have increasingly diversified ways of expression, and it has become the norm to use multimodal content such as text, images, and video. As the social media user base continues to expand, the multimodal data on social networks is growing rapidly, and this mass of data contains the emotional information expressed by users. Sentiment analysis of such multimodal data is of great significance for opinion mining, policy analysis, and stock market forecasting. This paper conducts sentiment analysis research on the image-text multimodal data that is abundant in social media.

Multimodal data in social media has two characteristics. First, images and texts each carry their own unique information, and effectively acquiring the emotional information of each modality is the basis for improving multimodal sentiment analysis. Second, there are semantic relations among the modalities, with a wealth of interactive information between them. How to effectively extract the sentiment-semantic feature representation of each modality, and how to effectively learn the deeper relational semantic information among the modalities, are therefore urgent problems for social multimodal sentiment analysis. To solve these problems, this paper proposes multimodal sentiment analysis methods based on the attention mechanism. The specific research contents are as follows:

1) To obtain a more effective representation of sentiment characteristics and reduce the interference of redundant information irrelevant to sentiment, this paper proposes a multimodal sentiment classification method based on an attention neural network. The parts that matter more for sentiment classification are identified through network training, and a sentiment-enhanced feature representation is obtained by weighting them accordingly. Then, according to the importance of each modality within the joint representation, the modal features are weighted a second time. Finally, a tensor fusion strategy is used to obtain the joint feature vector representation of images and texts, on which sentiment classification is carried out. In this method, attention enhances the expression of emotional information in the modal features and reduces the interference of redundant information, while tensor fusion effectively captures the interaction information between the modalities, improving the accuracy of sentiment classification; a sketch of this pipeline follows.
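The following is a minimal PyTorch sketch of this attention-then-tensor-fusion pipeline. The dimensions, module names, and the exact form of the attention scorers and modality gate are illustrative assumptions, not the paper's precise architecture.

```python
# Minimal sketch: intra-modal attention, modality-level re-weighting,
# then tensor fusion (outer product) for sentiment classification.
# All dimensions and module names are illustrative assumptions.
import torch
import torch.nn as nn

class AttentiveTensorFusion(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=300, hid=64, n_classes=3):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hid)
        self.txt_proj = nn.Linear(txt_dim, hid)
        self.img_att = nn.Linear(hid, 1)        # scores image regions
        self.txt_att = nn.Linear(hid, 1)        # scores text tokens
        self.mod_gate = nn.Linear(2 * hid, 2)   # second, modality-level weighting
        self.classifier = nn.Linear((hid + 1) * (hid + 1), n_classes)

    def forward(self, img_regions, txt_tokens):
        # img_regions: (B, R, img_dim); txt_tokens: (B, T, txt_dim)
        hi = torch.tanh(self.img_proj(img_regions))            # (B, R, hid)
        ht = torch.tanh(self.txt_proj(txt_tokens))             # (B, T, hid)
        # Intra-modal attention: emphasize sentiment-relevant parts,
        # suppress redundant ones.
        vi = (torch.softmax(self.img_att(hi), 1) * hi).sum(1)  # (B, hid)
        vt = (torch.softmax(self.txt_att(ht), 1) * ht).sum(1)  # (B, hid)
        # Re-weight whole modalities by their learned importance.
        g = torch.sigmoid(self.mod_gate(torch.cat([vi, vt], 1)))  # (B, 2)
        vi, vt = vi * g[:, :1], vt * g[:, 1:]
        # Tensor fusion: outer product of the vectors (with a constant 1
        # appended) keeps unimodal terms plus all pairwise interactions.
        ones = vi.new_ones(vi.size(0), 1)
        zi = torch.cat([vi, ones], 1).unsqueeze(2)             # (B, hid+1, 1)
        zt = torch.cat([vt, ones], 1).unsqueeze(1)             # (B, 1, hid+1)
        fused = torch.bmm(zi, zt).flatten(1)                   # (B, (hid+1)^2)
        return self.classifier(fused)                          # sentiment logits
```

Appending the constant 1 before the outer product preserves the unimodal features alongside the bimodal interaction terms, which is what lets the fused vector capture inter-modal information rather than only per-modality cues.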
2) To learn the deep interaction information between modalities and to identify the implicit sentiment expressions in multimodal data, a multimodal fusion method based on co-attention is proposed on the basis of the previous work. The deep interaction information is mined by letting the modalities guide each other's attention weights. The method performs well on the task of irony recognition, and the experiments show that it can effectively learn the deep interactive information between the modalities. In addition, an irony recognition method based on decision fusion is proposed, which integrates the output results of several irony recognition models and further improves the accuracy of irony recognition. A sketch of both steps is given below.
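Below is a minimal sketch of the two ideas in this part: a co-attention module in which each modality guides the attention weights of the other, and a simple decision-fusion step that combines the predictions of several irony recognition models. The affinity computation and the averaging rule are illustrative assumptions rather than the exact formulation used in the paper.

```python
# Minimal sketch: image-text co-attention (each modality guides the
# other's attention) and decision-level fusion of several models.
# Names, dimensions, and pooling choices are illustrative assumptions.
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.w = nn.Linear(dim, dim, bias=False)  # bilinear affinity weights

    def forward(self, hi, ht):
        # hi: (B, R, dim) image region features; ht: (B, T, dim) text tokens.
        # Affinity matrix: how strongly each region relates to each token.
        affinity = torch.bmm(self.w(hi), ht.transpose(1, 2))   # (B, R, T)
        # Text-guided image attention: pool the affinity over tokens.
        ai = torch.softmax(affinity.max(dim=2).values, dim=1)  # (B, R)
        # Image-guided text attention: pool the affinity over regions.
        at = torch.softmax(affinity.max(dim=1).values, dim=1)  # (B, T)
        vi = torch.bmm(ai.unsqueeze(1), hi).squeeze(1)         # (B, dim)
        vt = torch.bmm(at.unsqueeze(1), ht).squeeze(1)         # (B, dim)
        return torch.cat([vi, vt], dim=1)                      # joint vector

def decision_fusion(logits_list):
    # Late (decision-level) fusion: average the class probabilities
    # predicted by several irony recognition models.
    probs = [torch.softmax(l, dim=1) for l in logits_list]
    return torch.stack(probs).mean(dim=0)
```

Decision fusion is shown here as a simple probability average; weighted voting or a learned combiner over the model outputs are common alternatives to the same late-fusion idea.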