
Research On Link Prediction In Heterogeneous Information Networks

Posted on: 2021-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Peng
Full Text: PDF
GTID: 2428330611498842
Subject: Computer Science and Technology
Abstract/Summary:
A network is a model that describes the relationships between entities, and the different entities in the world interweave to form different networks. The link prediction problem aims to predict whether links will exist between nodes in the future network by studying and analyzing the network formed over a historical period, and it is of great significance in network research. With link prediction, companies that run social software can recommend friends to their users, biomedical researchers can discover unknown protein interactions, and scholars in academic networks can find potential collaborators with similar research interests. Traditional network modeling with a single type of node and edge is too simple and tends to ignore the relevant information that exists between different kinds of entities. Scholars have therefore begun to study heterogeneous information networks, which contain multiple types of nodes and multiple types of edges.

This thesis studies the link prediction problem in heterogeneous information networks. Existing methods have shortcomings in three areas: generating the set of relevant meta-paths, measuring meta-path similarity, and integrating the similarities of different meta-paths. For meta-path generation, this thesis designs the LLMG (Limit Length Meta-path Generating) algorithm, which generates relevant meta-paths automatically and avoids the traditional drawback of selecting them manually based on prior knowledge. For similarity measurement, this thesis designs the HLE-T (Heterogeneous Link Entropy with Time) algorithm, which combines link entropy with dynamic time information. It takes into account both the different influence that different nodes in a meta-path instance have on that path instance and the time labels attached to the path instance. Finally, this thesis improves the traditional supervised link prediction model based on binary classification by designing the MSLP (Modified Supervised Link Prediction) algorithm. In the training labeling phase, node pairs are no longer simply labeled 0 or 1. Instead, each node pair is assigned a link strength value computed from the second-order weighted path size between the node pair to be predicted in the projection of the network onto the target meta-path, which makes more reasonable use of the information in the network.

Experiments use the AMiner data set of the DBLP network, which this thesis divides into four data sets according to the number of papers written by each author. The evaluation metrics are AUC, prec@20, and prec@100. The results show that the proposed algorithm achieves good prediction results when the length limit is 4. The HLE-T algorithm performs much better than traditional meta-path-based similarity algorithms, and the MSLP algorithm yields a small improvement in AUC and prec@L over the traditional binary classification link prediction model on all four data sets.
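
The idea of generating meta-paths under a length limit can be illustrated with a small sketch. The abstract does not describe how LLMG works internally, so the code below is not LLMG itself: it is a plain breadth-first enumeration of type sequences over a network schema, with no relevance filtering. The schema (authors A, papers P, venues V, terms T) and the function name `enumerate_meta_paths` are assumptions chosen to resemble a DBLP-style network.

```python
from collections import deque

# Hypothetical DBLP-style schema (an assumption, not taken from the thesis):
# A = author, P = paper, V = venue, T = term.
# Each key lists the node types it can be directly linked to.
SCHEMA = {
    "A": ["P"],
    "P": ["A", "V", "T"],
    "V": ["P"],
    "T": ["P"],
}


def enumerate_meta_paths(schema, source, target, max_len):
    """Return every type sequence from source to target with 1..max_len edges.

    This is a plain breadth-first enumeration over the schema graph.
    It performs no relevance ranking or pruning beyond the length limit.
    """
    found = []
    queue = deque([(source,)])
    while queue:
        path = queue.popleft()
        edges = len(path) - 1
        # Record the path if it has at least one edge and ends at the target type.
        if edges >= 1 and path[-1] == target:
            found.append("".join(path))
        # Extend the path only while it is still under the length limit.
        if edges < max_len:
            for next_type in schema[path[-1]]:
                queue.append(path + (next_type,))
    return found


# Author-to-author meta-paths with at most 4 edges.
paths = enumerate_meta_paths(SCHEMA, "A", "A", 4)
print(paths)
```

Under this assumed schema, a limit of 4 yields the co-author path APA plus three longer paths through a shared co-author, venue, or term. Raising the limit makes the candidate set grow quickly, which is one reason a bounded length is a practical choice.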
Keywords/Search Tags: link prediction, heterogeneous information network, meta-path, node similarity