Font Size: a A A

Research On Approaches And Applications Of Link Prediction In Complex Networks

Posted on:2018-09-14Degree:DoctorType:Dissertation
Country:ChinaCandidate:J Y DingFull Text:PDF
GTID:1360330542493474Subject:Circuits and Systems
Abstract/Summary:PDF Full Text Request
The investigation of link prediction in networks is an important issue in many disciplines.Link prediction helps us not only to understand the evolution mechanism of the complex networks theoretically,but also to solve very important issues in applications.For example,it can be used in the field of information integration,social network analysis,recommender systems and bio-informatics.The research of prediction algorithms which require short time but high accuracy is still a challenging task.Most of the existing algorithms are based on the topological information of the networks,including the local or global similarity indices.The prediction algorithms based on the local similarity information of the network have low time complexity,but the accuracy is not so good;while the link prediction algorithms based on the global similarity information of the network are just the opposite,they improve the prediction accuracy,but at the same time the complexity also increases.So more and more researchers explore the compromise of the accuracy and the complexity,and want to design a more efficient link prediction algorithm.At the same time,there are more scholars explore the effect of the characteristics of the complex networks on the link prediction algorithm,such as power-law distribution characteristics and the small world phenomenon.As the important characteristics of the complex networks,the hierarchical structure information and the community structure information may indeed provide insights for link prediction,but the currently existing algorithms did not make full use of these latent information of the network.So,in this paper,we extract the community structure of the network as possible as we can.And propose three link prediction algorithms based on the information we obtained;In addition,we analyze the networks which suit to our methods;Finally,we put forward an effective community detection algorithm by using the idea of the methods which used to solve the problem of link prediction.(1)A link prediction method based on the multi-resolution community division is proposed to solve the link prediction problem in the network.This is the first time to solve the link prediction problem in complex networks by using the multi-resolution community structure information of the network.There are three main steps in our algorithm.The first step of our work is multi-resolution community division.In order to avoid the problem of resolution limit,and to get enough statistical samples,we use the method based on the optimization of modularity density to detect communities of the network.To get the highest precision is the aim of the traditional community division methods,so it is difficult to control the running time.In this chapter,it’s different from the traditional community division methods,the proposed method is only care about the diversity of the partition results,thus can be greatly reduce the time complexity.The second step is to calculate the frequency of each pair of vertices divided into the same community.We assume that the community divisions under different resolutions have different impact factors for prediction,so we give different weighted value to them.Finally,we estimate the probability of each pair of nodes according to the conversion formula between frequency and probability.If a pair of vertices has a greater probability,they are more likely to be connected by a missing link.These pairs are regarded as the most likely candidates for missing connections.The performance of our algorithm is demonstrated by comparing with other state of the art methods on benchmark networks and real networks in different scales.The results indicate that our approach is effective and accurate.(2)An algorithm based on the community relevance and ruler inference is proposed to predict missing links in the networks.We advocate a shift in perspective from nodes to communities,and in particular from nodes similarity to community relevance.Our method is on the new philosophy in the formulation of community-based indices.It focuses on the relationship between different communities,especially the impact of this relationship on link prediction.There are three main steps in our method.Firstly,extract the community structure of the network.In order to decrease the time complexity,we design a fast community division method based on the local information of the network.Secondly,calculate the relevance of each pair of communities by using the new community relevance indices.Here,the community relevance indices is used to describe the closeness between different communities,so the greater the value of the indices is,the more closeness the two communities are;Finally,a simple prediction model based on ruler inference is applied to estimate the probability of the missing links.It is shown that the community relevance features improve the predictor with low time complexity.Moreover,with experiments on benchmark networks and real-world networks in different scales,and compared with other ten sate of the art approaches,we find that the proposed method overcomes the drawbacks of the traditional similarity algorithms,that is,the probability of the link between two nodes in different communities is exactly the same as the similarity between them.(3)Through the compromise between the time complexity and the accuracy we design an unsupervised algorithm based on community relevance to predict the missing links in the networks quickly and accurately.We consider more information of the network structure,including part of the global information of the network,and combination the local and global information effectively to design the community relevance indices,the accuracy of the algorithm can be obviously improved.There are three steps in our method.Firstly,get the initial partition according to the COMM-ST algorithm.Then calculate the closeness between different communities according to the designed community relevance indices which include the local and global information of the network;Finally,predict the existence probability of the missing link according to the community similarity matrix.With experiments on benchmark networks and real-world networks in different scales,we find that it overcomes the limitation of the performance of the prior algorithm by the consistency of the degree distribution of the network,and the experimental results show that the new algorithm has a higher accuracy under the acceptable time complexity.(4)Analyze the two major problems of the link prediction algorithms based on community relevance.One problem is the requirement of the consistent degree distribution of the network.The other is the over fitting problem of the algorithms.According to the problems,we propose the hypothesis,and then through a large number of experiments to verify the correctness of the hypothesis and to find the critical point of the problems.The results showed that when the training set is not enough,the performance of the algorithm based on the local similarity indices will decline,because that the degree distribution of the network is inconsistent;When the training set is too much,the performance of the algorithm based on the semi global similarity and the global similarity indices will decline,because that the number of the efficient solutions in the relevance metric is too much.In addition,we quantitatively analyze the critical points in which the over-fitting phenomenon appeared.It makes sense to avoid over-fitting problem.(5)Most traditional community detection algorithms are based on modularity optimization.However,these methods not only have disadvantages in computational complexity,but also have the problem of resolution restriction.To solve this problem,a new community detection method based on the community relevance is proposed.The main idea of this method is to get a reasonable division of the network by merging the communities which have the biggest community relevance.This method utilizes correlation between communities as a consolidation strategy.Firstly,initialize a community partition based on the distribution of node degree;Secondly,calculate the similarity value between different communities.Here,the similarity is the index to describe the closeness of the different communities.We believe that the closer the two different communities are,the greater the likelihood of being divided together;Finally,merge the pairs of communities which has higher similarity value as possible as we can and stop when the condition is not satisfied.Because the convergence of our algorithm is very fast in the process of merging,we find that our method has advantages both in the time complexity and the accuracy when compared with other six classical algorithms.Moreover,the consolidation strategy of this method can also be a measure to describe the difficulty of network division.
Keywords/Search Tags:link prediction, community detection, complex networks, machine learning, data mining
PDF Full Text Request
Related items