Research On Recommender Systems By Utilizing Multi-Modal Data User Modeling

Posted on:2024-09-24

Degree:Master

Type:Thesis

Country:China

Candidate:A B Li

Full Text:PDF

GTID:2568307124960079

Subject:Electronic information

Abstract/Summary:

Recommendation systems are an important means of addressing information overload: they recommend content that matches users' interests from a large volume of information. With the rapid development of streaming media, user information has become more diverse, evolving from single structured text content to multimodal information (text, image, audio, and video). However, data sparsity is unavoidable in recommendation systems, whether the data is single-modal or multimodal. This thesis addresses three issues in existing recommendation systems: underutilized multi-view textual information, poor expressiveness and transferability of visual features extracted from images, and limited semantic communication between modalities. It conducts research on three corresponding aspects: multi-view text feature extraction, image feature extraction, and multimodal knowledge sharing. Finally, by integrating the text and image modalities, a personalized movie recommendation system based on multimodal data is constructed. The research content of this thesis mainly includes the following three aspects.

First, this thesis proposes a multi-view text personalized recommendation model that combines a CNN with multi-head self-attention. The model tackles the sparsity caused by insufficient use of multi-view information during text feature extraction, from both the text representation and the user representation side. In the multi-view text feature extraction module (ECMSA), the model extracts features from multiple components of the text so as to capture as much of the information carried by the text as possible. Embedding techniques then reduce the dimensionality of the multi-view text vectors, which are fed to the CNN to extract features from the short-range context of words. Next, the extracted feature vectors are passed to a multi-head self-attention network, which extracts features over a longer range and models the relationships between the multi-view text feature vectors. Finally, the model uses additional attention to select features. In the user representation learning module, the model uses a multi-head self-attention network to model the interactions within the user's historical click data and enhance the user representation. A click predictor then produces the prediction. In experiments the model achieves an AUC of 0.6383 and an nDCG@5 of 0.3586, outperforming the other methods compared.

Second, this thesis proposes a personalized recommendation model that integrates text and image data, yielding a multimodal recommendation model for user profiling. The model consists of three parts. The ECMSA framework is used to extract text features. The IEMSA framework is proposed to address poor image feature representation and transferability: it uses Inception V3 for visual feature extraction, reduces dimensionality with embedding techniques, and learns the relationships between different visual features with multi-head self-attention. A multimodal knowledge sharing module uses cross-modal attention to address information exchange between modalities. In addition, additional attention is incorporated into both the text and the image feature extraction processes to select the feature vectors that carry more information. Finally, the prediction model outputs the result. Experimental results show that the proposed method achieves a lower loss of 0.3301 and a higher AUC of 0.8892 than the other methods compared.

Finally, based on the research above, this thesis integrates the two models into a single framework and designs and implements a prototype personalized movie recommendation system based on multimodal data, using the public MovieLens dataset. The system provides personalized recommendation through multimodal (text and image) user profiling, and also includes popularity-based recommendation and similarity-based recommendation; together these are the three main functions of the system. Experiments show that the system performs well.
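The abstract reports AUC and nDCG@5 without defining them. The following is a minimal sketch of how the two metrics are commonly computed, assuming binary click labels and one list of model scores per impression. The thesis's own evaluation protocol is not given in the abstract, and the function names `dcg_at_k`, `ndcg_at_k`, and `auc` are illustrative, not taken from the thesis.

```python
import math


def dcg_at_k(ranked_labels, k):
    """Discounted cumulative gain over the first k labels, which must
    already be in ranked order (best-scored item first)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(ranked_labels[:k]))


def ndcg_at_k(labels, scores, k=5):
    """nDCG@k for a single impression.

    labels: 0/1 click labels, one per candidate item.
    scores: model scores, aligned with labels.
    """
    # Order the labels by descending model score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    # The ideal ranking places every clicked item first.
    ideal_dcg = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def auc(labels, scores):
    """Probability that a randomly chosen clicked item is scored above a
    randomly chosen non-clicked item; ties count as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical impression: six candidates, two of them clicked.
example_labels = [0, 1, 0, 0, 1, 0]
example_scores = [0.2, 0.9, 0.4, 0.1, 0.3, 0.8]
example_auc = auc(example_labels, example_scores)
example_ndcg = ndcg_at_k(example_labels, example_scores, k=5)
```

In click-prediction benchmarks these values are usually averaged over all impressions in the test set. Whether the thesis averages per impression or computes a single global AUC is not stated in the abstract.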

Keywords/Search Tags:

Recommender Systems, Deep Learning, Multimodal Data, Attention Mechanisms

Related items

1	Research On Multimodal Model Based On External Attention Mechanism
2	Research On Deep Learning Based Recommendation System Models And Explainability
3	Research And Implementation Of Multimodal News Recommender System Based On Deep Learning
4	Research On Attention Collaborative Autoencoder Of Recommender Systems
5	Multimodal Data Representation And Applications On Content Curation Social Networks
6	Research On Multimodal Emotion Analysis Method Based On Deep Learning
7	Research And Practice Of Recommendation System Based On Graph Neural Networks
8	Research And Implementation Of Recommendation System Based On Multi Graph Neural Networks
9	Research On Personalized Recommendation Algorithm Based On Deep Learning
10	Research Of The Deep Learning Based Intelligent Recommender System