
Research On Feature Fusion Of Multimodal Data Based On Deep Learning

Posted on: 2022-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: K Zhang
Full Text: PDF
GTID: 2518306323960429
Subject: Software engineering
Abstract/Summary:
Multimodal big data is typically composed of data in several different structural forms. The descriptions these different modalities give of the same object are partly independent yet strongly correlated. Accurately and efficiently extracting and processing the fused information hidden in multimodal big data helps solve a wide range of multimodal data analysis tasks. In recent years, with the popularity of social media, more and more users are keen to express their feelings and opinions in several media forms at once, which has greatly increased the multimodality of the content users upload to social networking sites. Most of the information users publish on social media carries a clear emotional tone. People use multimodal data, including text, images, audio, and even video, to express their opinions on social networks, and analyzing such content is essentially a multimodal information fusion problem. Previous research on multimodal sentiment analysis focused on extracting text and image features separately and then combining them for sentiment analysis. However, such work often ignores the interaction between related information in the text and the image, and the combined data contains a great deal of redundant information irrelevant to sentiment analysis. To address these problems, this thesis studies feature extraction and multimodal feature fusion and proposes two multimodal sentiment analysis models:

(1) This thesis introduces the basic concepts of deep learning and popular machine learning algorithms, studies the architectures and characteristics of mainstream deep learning models in depth, and reviews the important work that applies these models to multimodal feature fusion. The main task of this thesis is to analyze multimodal data with feature fusion algorithms, and the ultimate goal is to build a complete multimodal sentiment analysis model.

(2) This thesis proposes a multimodal interactive feature fusion model based on an attention mechanism, which effectively fuses the features of different modalities and provides more effective and accurate information for the sentiment classification task. First, text features are extracted with a sparse denoising autoencoder. Second, image features are extracted with a variational autoencoder. Finally, an interactive feature fusion module based on a fine-grained attention mechanism lets the text and image features learn each other's internal information, and the fused features are then applied to the sentiment classification task (a code sketch follows below). The model reduces noise interference in the text data and extracts the image features that matter most for sentiment analysis. The feature fusion module combines the most useful sentiment information from both modalities, making effective use of the internal and associative information of the text and image data.
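As a rough illustration of the interactive fusion idea in (2), here is a minimal PyTorch sketch. It assumes the sparse denoising autoencoder and the variational autoencoder have already produced sequences of 128-dimensional text-token and image-region features; all dimensions, layer choices, and names (InteractiveFusion, txt2img, img2txt) are illustrative assumptions, not details taken from the thesis.

```python
import torch
import torch.nn as nn

class InteractiveFusion(nn.Module):
    """Cross-modal interactive fusion sketch: text and image features
    attend to each other before being pooled and concatenated for
    sentiment classification. All sizes are illustrative assumptions."""

    def __init__(self, dim=128, n_classes=2):
        super().__init__()
        # fine-grained attention: each modality queries the other
        self.txt2img = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.img2txt = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_classes)

    def forward(self, txt, img):
        # txt: (B, T, dim) token features from the denoising autoencoder (assumed)
        # img: (B, R, dim) region features from the variational autoencoder (assumed)
        t, _ = self.txt2img(txt, img, img)   # text enriched with image cues
        i, _ = self.img2txt(img, txt, txt)   # image enriched with text cues
        fused = torch.cat([t.mean(dim=1), i.mean(dim=1)], dim=-1)
        return self.classifier(fused)

model = InteractiveFusion()
logits = model(torch.randn(8, 20, 128), torch.randn(8, 36, 128))  # (8, n_classes)
```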
(3) A multimodal feature fusion method based on tensor fusion and a gated convolution mechanism is proposed to address the redundancy of fused information in the multimodal sentiment classification task. First, the best available pre-trained model is used as a text feature extractor to obtain text features that capture the essence of the language. Second, a convolutional neural network with an attention mechanism extracts image features and highlights the local features in the image data that carry important sentiment information. Third, a tensor fusion network serves as the feature fusion tool for the multimodal data, with the tensor product of the text and image features used as the fused feature representation. Finally, a dedicated information extraction module re-filters the fused features with a gated convolution mechanism, which effectively removes the redundant information from the joint features (see the sketch below).
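The following is a minimal PyTorch sketch of the tensor fusion and gated convolution steps in (3), under the common tensor-fusion formulation in which each modality vector is padded with a constant 1 before the outer product, and with a GLU-style sigmoid branch standing in for the gated convolution. The feature sizes and the module name TensorFusionGated are illustrative assumptions, not details from the thesis.

```python
import torch
import torch.nn as nn

class TensorFusionGated(nn.Module):
    """Sketch: tensor fusion (outer product of 1-padded modality vectors)
    followed by a gated 1-D convolution that filters redundant components.
    All sizes are illustrative assumptions."""

    def __init__(self, t_dim=32, v_dim=32, n_classes=2):
        super().__init__()
        fused_dim = (t_dim + 1) * (v_dim + 1)
        # GLU-style gated convolution: one output channel filters, one gates
        self.conv = nn.Conv1d(1, 2, kernel_size=3, padding=1)
        self.classifier = nn.Linear(fused_dim, n_classes)

    def forward(self, h_t, h_v):
        ones = torch.ones(h_t.size(0), 1, device=h_t.device)
        # tensor fusion: outer product of the 1-padded modality vectors
        t = torch.cat([h_t, ones], dim=1)                          # (B, t_dim+1)
        v = torch.cat([h_v, ones], dim=1)                          # (B, v_dim+1)
        z = torch.bmm(t.unsqueeze(2), v.unsqueeze(1)).flatten(1)   # (B, fused_dim)
        # gated convolution suppresses redundant fused components
        a, g = self.conv(z.unsqueeze(1)).chunk(2, dim=1)
        z = (a * torch.sigmoid(g)).squeeze(1)
        return self.classifier(z)

model = TensorFusionGated()
logits = model(torch.randn(8, 32), torch.randn(8, 32))  # (8, n_classes)
```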
Keywords/Search Tags: Deep learning, multimodal data, sentiment analysis, feature fusion