Research On Short Video Recommendation Algorithm Based On Multimodal Graph Convolutional Network

Posted on:2024-05-23

Degree:Master

Type:Thesis

Country:China

Candidate:Q S Li

Full Text:PDF

GTID:2568307067499884

Subject:Engineering

Abstract/Summary:

Short video platforms are an important form of interactive platform in the Internet era, and personalized recommendation algorithms are key to their development. The main purpose of these algorithms is to identify user preferences accurately, so that recommendations are precise and users save search time. However, compared with text content, short videos carry rich modal information, and traditional recommendation algorithms suffer from a "semantic gap" when applied to them. As a result, the user preferences they capture are inaccurate, which directly affects the platform's recommendation quality and user experience. To this end, this thesis starts from multimodal features and studies short video recommendation algorithms based on graph convolutional networks, with the aim of improving the recommendation quality of short video platforms.

After introducing the relevant theories and technologies of multimodality, graph convolutional networks, and short video recommendation, this thesis proposes a theoretical framework for a multimodal graph convolutional network short video recommendation algorithm and designs the algorithms for its aggregation layer, integration layer, fusion layer, and output layer. It then points out the shortcomings of the framework in application and proposes improvements. In addition, comparative experiments on public datasets are carried out to demonstrate the effectiveness and reliability of the proposed algorithm.

The theoretical model of personalized recommendation based on the multimodal graph convolutional network mainly comprises the user-short video bipartite graph, the aggregation layer, the integration layer, the fusion layer, and the output layer. On the basis of this framework, improvements and experimental comparisons are proposed for the existing deficiencies, mainly in the following aspects:

(1) In the aggregation layer, as the number of network layers increases, the efficiency of information propagation decreases steadily and the over-smoothing problem appears. In addition to aggregating the first-order neighborhood of the user and short video target vertices, this thesis adds an aggregation path from second-order neighbors to the target vertices, which achieves two-level aggregation. At the same time, to address the shortcomings of the mean aggregation and max-pooling aggregation algorithms used in current aggregation layers, this thesis proposes an attention-based aggregation algorithm. It uses an attention score to represent the similarity between the target vertex and each neighborhood node, and guides the target vertex to learn from its neighbors in proportion to that similarity.

(2) In the integration layer, the existing algorithm ignores the difference between the influence of the target vertex's attribute information and that of its structure information on user preference, which leads to incomplete integration of the information from the aggregation layer. This thesis adopts an outer product integration strategy and uses negative sampling for unsupervised optimization.

(3) Experiments on three public datasets, Kuaishou, TikTok, and MovieLens, show that the multimodal graph convolutional network short video recommendation algorithm performs better than the single-modal graph convolutional short video recommendation algorithm. The test results of different algorithms on the three datasets show that the proposed algorithm outperforms the GAT, GraphSAGE, EGES, and DeepWalk algorithms. Comparative experiments were also conducted on the three datasets with different aggregation strategies, aggregation algorithms, and integration algorithms. The results show that the proposed strategies and algorithms perform better than the traditional ones.
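
As a rough illustration of the attention-based two-level aggregation and the outer product integration described above, the following minimal Python sketch may help. It is not the thesis's implementation: the dot-product similarity, the softmax normalization, the summation of the first-order and second-order results, the toy graph, and all function names are assumptions made for illustration. Negative sampling and training are not shown.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(target, neighbors):
    """Weighted sum of neighbor vectors.

    Assumption: the attention score is the dot-product similarity between
    the target vertex and each neighbor, normalized with a softmax.
    """
    if not neighbors:
        return [0.0] * len(target)
    weights = softmax([dot(target, n) for n in neighbors])
    return [
        sum(w * n[i] for w, n in zip(weights, neighbors))
        for i in range(len(target))
    ]


def two_level_aggregate(node, features, adjacency):
    """Aggregate first-order and second-order neighbors of `node`.

    Assumption: the two aggregated vectors are combined by summation.
    """
    target = features[node]
    first = adjacency[node]
    # Second-order neighbors: neighbors of neighbors, excluding the node
    # itself and its first-order neighbors.
    second = sorted({k for j in first for k in adjacency[j]} - {node} - set(first))
    h1 = attention_aggregate(target, [features[j] for j in first])
    h2 = attention_aggregate(target, [features[k] for k in second])
    return [a + b for a, b in zip(h1, h2)]


def outer_product_integrate(attribute, structure):
    """Flattened outer product of the attribute and structure vectors."""
    return [a * s for a in attribute for s in structure]


# Hypothetical toy user-short video bipartite graph (users u*, videos v*).
features = {
    "u1": [1.0, 0.0],
    "u2": [0.5, 0.5],
    "v1": [1.0, 0.0],
    "v2": [0.0, 1.0],
}
adjacency = {
    "u1": ["v1", "v2"],
    "u2": ["v1"],
    "v1": ["u1", "u2"],
    "v2": ["u1"],
}

structure = two_level_aggregate("u1", features, adjacency)
integrated = outer_product_integrate(features["u1"], structure)
print(structure, integrated)
```

In this toy graph, the video whose features are more similar to the user's receives the larger attention weight, so it contributes more to the aggregated structure vector than it would under mean aggregation.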

Keywords/Search Tags:

Multimodal, Graph convolution, Short video, Aggregation layer

Related items

1	The Construction Of The National Cultural Image By The Short Video Program "China Mosaic" From The Perspective Of Multimodal Discourse
2	Video Semantic Analysis Based On Multimodal Features
3	The Key Technology Research Of Short Video Multimodal Retrieval Based On Deep Learning Technology
4	Research On Multimodal Emotion Recognition Technology
5	Research On Video Captioning Methods Based On Visual Text Association And Multimodal Feature Fusion
6	Interesting Production And Cultural Representation:Discourse Analysis Of Short Video Of "Taste"
7	Multimodal Semantic Alignment For Referring Video Segmentation
8	The Path Of Constructing Urban Image From The Perspective Of Multimodal Metaphor Of Chongqing’s "Shuang Shai" Short Videos
9	Research On Video Content Analysis Method Based On Multimodal Feature Fusion
10	Research On Camouflaged Target Detection Algorithm Based On Multi-layer Feature Aggregation