
Meta-path Based Nodes Similarity Measure In Heterogeneous Information Networks

Posted on: 2019-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: Q C Wu
Full Text: PDF
GTID: 2428330590992323
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, the rapid growth of the internet industry has made the analysis of large-scale information networks a hot topic in both academia and industry. Because real-world entities come in many types, heterogeneous information networks (HINs) have attracted particular attention. HINs allow nodes of different types to be interconnected. This matches the real relationships between objects more closely, so such networks carry richer structural and semantic information. Among the applications in HIN-related domains, node similarity measurement is especially important because it lays the foundation for research areas such as recommender systems, information retrieval, and link prediction. In this thesis, we systematically summarize work related to node similarity measurement in HINs and propose a combinational meta-path mining algorithm for path-constrained similarity measurement. We also introduce methods widely used in graph representation learning and extend meta-path constraints to graph embedding. The work in this thesis is summarized as follows.

First, we give formal definitions of the basic concepts used throughout the thesis and then introduce classical node similarity measures in homogeneous and heterogeneous contexts. In homogeneous scenarios, both feature-based and link-based methods are covered. In heterogeneous contexts, meta-path-based algorithms are the dominant way to reveal correlations between nodes.

Second, going beyond typical node similarity measures, we introduce graph representation learning as a solution, which converts the node similarity problem into the problem of generating node vectors. Within network representation learning, we concentrate on algorithms based on simple neural networks, which embed a graph to obtain low-dimensional vector representations of the nodes in an information network. Experiments on the DBLP dataset show their advantages in similarity measurement. We also extend the meta-path concept to graph embedding in order to generate node vectors that carry richer structural and semantic information, and such vectors are helpful for node similarity measurement.

Finally, we focus on meta-path-based similarity measurement and propose the combinational meta-path mining (CMPM) algorithm to generate semantically rich meta-paths. The algorithm relies on two assumptions: that short meta-paths carry more significant semantics, and that the distribution of path instances is vital for weighting meta-paths. The resulting meta-paths can improve node similarity results.
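To make the idea of meta-path-based similarity concrete, the sketch below computes PathSim, a well-known meta-path-based measure from the HIN literature, on a toy bibliographic network. It is an illustration only: the abstract does not state which measures the thesis implements, and the author names, papers, venues, and the helper names `count_half_paths` and `pathsim` are all hypothetical. For a symmetric meta-path such as Author-Paper-Venue-Paper-Author, PathSim between x and y is twice the number of path instances from x to y, divided by the number of instances from x to itself plus the number from y to itself.

```python
from collections import defaultdict


def count_half_paths(start_nodes, relations):
    """Count path instances from each start node along a chain of relations.

    `relations` is a list of dicts mapping a node to its neighbours; the chain
    is the first half of a symmetric meta-path (e.g. Author -> Paper -> Venue).
    """
    counts = {}
    for start in start_nodes:
        frontier = {start: 1}
        for relation in relations:
            next_frontier = defaultdict(int)
            for node, n_paths in frontier.items():
                for neighbour in relation.get(node, []):
                    next_frontier[neighbour] += n_paths
            frontier = next_frontier
        counts[start] = dict(frontier)
    return counts


def pathsim(x, y, half):
    """PathSim under the symmetric meta-path whose first half is `half`."""
    def dot(a, b):
        return sum(v * b.get(k, 0) for k, v in a.items())

    denom = dot(half[x], half[x]) + dot(half[y], half[y])
    return 2 * dot(half[x], half[y]) / denom if denom else 0.0


# Hypothetical toy data: who wrote which paper, and where it was published.
writes = {"alice": ["p1", "p2"], "bob": ["p2", "p3"], "carol": ["p4"]}
published_in = {"p1": ["KDD"], "p2": ["KDD"], "p3": ["VLDB"], "p4": ["VLDB"]}

# Half-paths for Author-Paper-Author and Author-Paper-Venue-Paper-Author.
apa = count_half_paths(writes, [writes])
apvpa = count_half_paths(writes, [writes, published_in])

print(pathsim("alice", "bob", apa))    # similarity through shared papers
print(pathsim("bob", "carol", apvpa))  # similarity through shared venues
```

The two meta-paths give different answers for the same pair of authors: bob and carol share no paper, so they are unrelated under Author-Paper-Author, yet they are related under the venue-based path. This is why the choice and weighting of meta-paths matters for the similarity result.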
Keywords/Search Tags: Heterogeneous Information Networks, Network Representation Learning, Meta-Path, Nodes Similarity Measure