Research On Boosting Based Method Of Link Prediction In Heterogeneous Information Network

Posted on: 2018-01-10    Degree: Master    Type: Thesis
Country: China    Candidate: X Dong    Full Text: PDF
GTID: 2348330542490944    Subject: Engineering
Abstract/Summary:
Link prediction is the task of estimating, from existing network structure information, the likelihood of a connection between two nodes that are not yet connected. With the arrival of the Big Data era, making full use of link prediction in network analysis has become an active research frontier. However, most existing link prediction methods target homogeneous information networks. In heterogeneous information networks, the complexity of the nodes and of their connections makes link prediction harder, and existing methods suffer from low accuracy and recall. Improving the performance of link prediction methods on heterogeneous information networks is therefore of considerable practical significance.

Boosting is one of the most successful methods in machine learning and can significantly improve the accuracy of a given algorithm. This thesis proposes NBL, a Boosting-based link prediction algorithm for heterogeneous information networks, which introduces Boosting into the field of link prediction and improves prediction performance through ensemble learning. First, the Boosting algorithm is adapted to the characteristics of heterogeneous information networks in two respects: sample selection accelerates training and avoids excessively long training times, and an additional threshold limits the growth of the weights of noisy samples and so prevents over-fitting. Then, several traditional link prediction methods are collected as base learning algorithms, and the performance of each is analyzed experimentally. Using the improved Boosting algorithm, the results of the traditional prediction methods are merged by weighting, with voting factors assigned according to each method's classification ability. Finally, a "member voting" rule determines the likelihood that a link is established between each pair of nodes.

To verify the feasibility of the method, three real heterogeneous information network data sets (DBLP, MovieLens, and Last.fm) are used to evaluate the proposed link prediction algorithm. Compared with the prediction results of traditional heterogeneous information network link prediction methods, the proposed method maintains prediction accuracy while improving recall. The results indicate good prediction performance and support the soundness of the algorithm.
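The abstract does not give the details of NBL, so the following is only a minimal sketch of the general idea it describes: an AdaBoost-style loop whose sample weights are capped by a threshold, followed by a weighted "member vote". Everything here is an assumption made for illustration, including the function names, the `weight_cap` parameter, the choice of thresholded similarity scores as base learners, and the clip-then-renormalize capping rule. The sample-selection speedup mentioned in the abstract is not shown.

```python
import math


def stump_vote(row, feature, threshold):
    """Vote +1 (link) if the chosen similarity score reaches the threshold, else -1."""
    return 1 if row[feature] >= threshold else -1


def train_capped_boosting(scores, labels, rounds=5, weight_cap=0.4):
    """AdaBoost-style training with a cap on individual sample weights.

    scores: one row per candidate node pair; each column is the score from one
            traditional link predictor (hypothetical inputs).
    labels: +1 if the pair is linked, -1 otherwise.
    Returns a list of (feature, threshold, alpha) members; alpha is the voting factor.
    """
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Pick the thresholded predictor with the lowest weighted error.
        best = None
        for feature in range(len(scores[0])):
            for threshold in sorted({row[feature] for row in scores}):
                err = sum(w for w, row, y in zip(weights, scores, labels)
                          if stump_vote(row, feature, threshold) != y)
                if best is None or err < best[0]:
                    best = (err, feature, threshold)
        err, feature, threshold = best
        if err >= 0.5:
            break  # no member better than chance; stop early
        err = max(err, 1e-10)  # avoid division by zero for a perfect member
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((feature, threshold, alpha))

        # Standard AdaBoost reweighting: misclassified pairs gain weight.
        weights = [w * math.exp(-alpha * y * stump_vote(row, feature, threshold))
                   for w, row, y in zip(weights, scores, labels)]
        total = sum(weights)
        # Assumed capping rule: clip normalized weights, then renormalize.
        # After renormalizing, a weight may slightly exceed the cap again.
        weights = [min(w / total, weight_cap) for w in weights]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict_link(ensemble, row):
    """Weighted member vote: +1 predicts a link, -1 predicts no link."""
    vote = sum(alpha * stump_vote(row, feature, threshold)
               for feature, threshold, alpha in ensemble)
    return 1 if vote >= 0 else -1
```

In use, each row might hold, for example, a common-neighbor count and a Jaccard coefficient for one node pair; `train_capped_boosting(scores, labels)` returns the weighted members and `predict_link(model, row)` applies the vote. Capping keeps a few persistently misclassified (possibly noisy) pairs from dominating later rounds, which is the over-fitting concern the abstract raises.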
Keywords/Search Tags: heterogeneous information network, link prediction, Boosting algorithm, weighted fusion