
Research On Heterogeneous Information Network Representation Learning Method

Posted on: 2021-03-11 | Degree: Master | Type: Thesis
Country: China | Candidate: L Zhou | Full Text: PDF
GTID: 2428330611950423 | Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, the amount of data is growing rapidly and its forms are constantly changing. Heterogeneous information networks are one of the most common data forms and appear in many areas of life, for example social networks, bibliographic information networks and movie information networks. They are large in scale, complex in structure and rich in information, so how to represent them is the key to unlocking their potential value. In recent years, heterogeneous information network representation learning has become a research hotspot. Its goal is to represent the nodes of a heterogeneous information network as low-dimensional, dense, real-valued vectors that carry rich heterogeneous information, and to apply the resulting node representation vectors to subsequent data mining tasks. Researchers have proposed many heterogeneous information network representation learning methods. These methods have advanced the field, but they still have deficiencies. This thesis analyzes the shortcomings of existing methods. The main research work is as follows.

First, existing heterogeneous network representation learning methods based on Generative Adversarial Networks (GANs) do not preserve the second-order semantic information of the heterogeneous information network. The resulting node representation vectors contain insufficient information, which leads to poor performance on downstream data mining tasks. To address this problem, this thesis studies a heterogeneous information network representation learning method that incorporates second-order semantic information. A GAN model is designed in which the generator produces fake meta-paths that are as close as possible to the real sampled meta-paths, and the discriminator tries to determine whether its input is a real or a fake meta-path. Through this adversarial game, node representation vectors containing both first-order and second-order semantic information are obtained. Experimental results on two real data sets, DBLP and AMiner, show that the method improves performance on two data mining tasks: node classification and link prediction.

Second, existing heterogeneous information network representation learning methods concentrate on preserving the semantic information of the network but do not preserve the distribution of its nodes. The resulting node representation vectors lack node distribution information, which again leads to poor performance on downstream data mining tasks. To address this problem, this thesis studies a heterogeneous information network representation learning method that incorporates node distribution information. A GAN model is designed in which the generator produces a fake matrix that preserves semantic information and is close to the prior distribution matrix, and the discriminator tries to determine whether its input is a generated fake matrix or a sampled prior distribution matrix. Through this adversarial game, node representation vectors that preserve both multi-level semantic information and node distribution information are obtained. Experimental results on multiple real data sets show that introducing node distribution information improves the quality of heterogeneous information network representation learning.
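To make the notion of a "real sampled meta-path" more concrete, the following is a minimal sketch of how a meta-path instance might be sampled from a small typed graph. It is not the thesis's implementation, which is not given in this abstract. The toy graph, the node names, the type letters (A for author, P for paper, V for venue) and the function names `build_adjacency` and `sample_meta_path_instance` are all assumptions made for illustration.

```python
import random

# Toy bibliographic network (hypothetical data, for illustration only).
# Every node has a type: A = author, P = paper, V = venue.
NODE_TYPE = {
    "a1": "A", "a2": "A", "a3": "A",
    "p1": "P", "p2": "P", "p3": "P",
    "v1": "V", "v2": "V",
}
EDGES = [
    ("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p3"),
    ("p1", "v1"), ("p2", "v1"), ("p3", "v2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def sample_meta_path_instance(adj, node_type, start, meta_path, rng):
    """Sample one node sequence whose node types follow `meta_path`.

    At each step the walk moves to a uniformly random neighbor that has
    the next required type. Returns None if the start node has the wrong
    type or if the walk gets stuck.
    """
    if node_type[start] != meta_path[0]:
        return None
    path = [start]
    for wanted_type in meta_path[1:]:
        candidates = [n for n in adj.get(path[-1], [])
                      if node_type[n] == wanted_type]
        if not candidates:
            return None
        path.append(rng.choice(candidates))
    return path


ADJ = build_adjacency(EDGES)

if __name__ == "__main__":
    rng = random.Random(0)
    # Author-Paper-Venue-Paper-Author: authors who publish at the same venue.
    print(sample_meta_path_instance(ADJ, NODE_TYPE, "a2",
                                    ["A", "P", "V", "P", "A"], rng))
```

In a GAN-based setup like the one described above, sequences sampled this way would play the role of real examples for the discriminator, while the generator would propose node sequences of its own. How the thesis actually parameterizes either model is not stated here.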
Keywords/Search Tags: Heterogeneous information network, Network representation learning, Generative Adversarial Networks, Meta-path