Font Size: a A A

The Application Research Of Sampling Methods In Large-scale Networks

Posted on:2016-04-28Degree:MasterType:Thesis
Country:ChinaCandidate:G N WangFull Text:PDF
GTID:2180330473955877Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the deepening understanding of static and dynamic characteristics in complex networks, people not only proposed indicators and calculation methods to describe networks, but also designed some analysis methods such as link prediction, community detection algorithms to mine valuable information from networks. However, as the scale of networks become larger, many of those methods are not suitable for large-scale network analysis due to the high time complexity and space complexity. One of the solutions to this problem is to use network sampling, and obtain approximate results based on sampled data sets at relatively low consumption levels. However, most existing sampling methods are based on single-phase sampling framework, which is difficult to accurately depict a complex network with links derived from mixture distributions, and unable to meet the needs of specific research areas. In this paper, we propose efficient methods based on novel sampling strategies for calculating shortest paths and binary relationship prediction in large-scale networks.It is a critical issue to compute the shortest paths between nodes in networks. Exact algorithms for shortest paths are usually inapplicable for large-scale networks due to the high computational complexity. In this paper, we propose a novel algorithm that is applicable for large networks with high efficiency and accuracy. The basic idea of our algorithm is to iteratively construct higher-level hierarchy networks by condensing the central nodes and their neighbors into ‘super nodes’ until the scale of the top-level network is very small. Then the algorithm approximates the distances of the shortest paths in the original network with the help of super nodes in the higher-level hierarchy networks. The experiment results show that our algorithm achieves both good efficiency and high accuracy compared with other algorithms.With the development of Internet technology, social networks have attracted more and more people as the main information exchange platform. In a social network, users hold and express positive and negative attitudes(e.g. support/opposition) towards others. Those attitudes exhibit some kind of binary relationships among the users, which play an important role in social network analysis. However, some of those binary relationships are likely to be latent as the scale of social network increases. The essence of predicting latent binary relationships has recently begun to draw researchers’ attention. In this paper, we propose a machine learning algorithm for predicting positive and negative relationships in social networks inspired by structural balance theory and social status theory. To deal with large-scale social network data, we employ a sampling strategy that selects small amount(1%) of training data while maintaining high accuracy of prediction. We compare our algorithm with traditional algorithms and adaptive boosting of them. Experimental results of typical data sets show that our algorithm can deal with large social networks and consistently outperforms other methods.
Keywords/Search Tags:complex networks, network sampling, shortest path, social networks, binary relationships
PDF Full Text Request
Related items