| Graph streams are massive, rapidly changing, and unbounded graph structures whose nodes and edges arrive continuously as a stream. These properties make graph streams difficult to store in memory or even on disk. Many applications can be modeled as graph streams, such as social networks and e-business. In these fields, link prediction, which aims to estimate the likelihood that a specific link exists, is an important problem. In graph streams, however, link prediction for a vertex is more common. For example, a social network generally wants to recommend several friends to a user rather than determine whether one specific user is that user's friend. Predicting groups of links rapidly and accurately is a formidable challenge because of the tremendous size and rapid updates of graph streams. TPLP (Two-Phase Selection Link Prediction System) aims to predict the top-k vertices that are most likely to connect to a target vertex in a graph stream. To improve link prediction efficiency, TPLP uses a two-phase selection algorithm to obtain candidates for the target vertex. TPLP also proposes an algorithm for estimating common neighbors in graph streams, an important measure in link prediction. Furthermore, TPLP implements the node clustering coefficient in graph streams and uses it for link prediction. To test the efficiency and accuracy of the algorithm, the experiments use four public real-world datasets that can be formulated as graph streams: the Amazon, DBLP, Wikipedia, and Super-User datasets. The experimental results demonstrate that our algorithms are more efficient and accurate than state-of-the-art approaches and thus can be applied to real-world graph stream applications. |
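
To make the quantities named above concrete, the following minimal sketch computes exact common-neighbor counts, the local clustering coefficient, and a top-k ranking for a target vertex on a small graph held entirely in memory. It is an illustration under stated assumptions and not the TPLP algorithm: TPLP estimates these measures over a stream, whereas this sketch stores every edge. All function names (`build_adjacency`, `common_neighbors`, `clustering_coefficient`, `top_k_candidates`) and the choice of two-hop neighbors as the candidate set are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edge_stream):
    """Consume (u, v) edges one at a time into an undirected adjacency map.

    Assumption: the whole graph fits in memory, which a real graph
    stream would not allow.
    """
    adj = defaultdict(set)
    for u, v in edge_stream:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def common_neighbors(adj, u, v):
    """Exact number of vertices adjacent to both u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves linked."""
    nbrs = adj.get(v, set())
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def top_k_candidates(adj, target, k):
    """Rank unlinked two-hop neighbors of target by common-neighbor count.

    Hypothetical candidate rule: only vertices exactly two hops away are
    scored. Ties are broken by vertex label so the output is deterministic.
    """
    nbrs = adj.get(target, set())
    candidates = set()
    for n in nbrs:
        candidates |= adj[n]
    candidates -= nbrs
    candidates.discard(target)
    scored = [(c, common_neighbors(adj, target, c)) for c in candidates]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


if __name__ == "__main__":
    edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e")]
    adj = build_adjacency(edges)
    print(top_k_candidates(adj, "a", 2))
    print(clustering_coefficient(adj, "d"))
```

Restricting candidates to two-hop neighbors is a natural choice when common neighbors is the score, since any vertex farther away shares no neighbor with the target and would score zero.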