Font Size: a A A

Community Detection And Multi-level Visual Analytics For Massive Networks

Posted on:2012-09-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:Q YeFull Text:PDF
GTID:1480303356972959Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
A large number of complex systems can be modeled using complex networks, such as friendship networks, co-authorship networks, on-line social networks, transportation networks, protein interaction networks, metabolic networks, etc. With the development of the science of complex networks, its focus has shifted from understanding the macrospotic characteristics to mesoscopic properties of networks. Community structure is one of the key mesoscopic structural properties of complex networks, and an impressive amount of work has been done on the issue of community detection.A large number of algorithms have been proposed in recent years to explore these link data, however, the high computationally demanding and memory requirement of many algorithms limits their applications. In this thesis, we attempt to explore the patterns in real-world network by proposing new low computational cost and low storage cost algorithms to measure and characterize real-world networks. We propose two novel community detection algorithms and a fast average shortest path length estimation algorithm, and we also propose an empirical study on community quality scores and their sizes on a large number of massive real-world networks. In addition to the conceptual developments, we provide a general purpose analytical platform for analyzing different kinds of patterns in real-world complex networks. The main contributions of this dissertation include the following:1. We propose an iterative heuristic algorithm to extract the community structure in large networks based on local multi-resolution modularity optimization whose time complexity is near linear and space complexity is linear. The effectiveness of our algorithm is demonstrated by extensive experiments on lots of computer generated graphs and public available real-world graphs. We show that our algorithm is faster and more accurate than most widely used community detection algorithms. We can explore the massive graph interactively in multi-scale views.2. We propose a highly efficient multi-scale link community detection algorithm based on link partition approach whose time complexity is near linear and space complex is linear. Furthermore, by tuning the resolution parameter one can probe the network at different scales, exploring the possible hierarchy of community structure. To evaluate the effectiveness of our algorithm, we give lots of experiments. The result shows that our algorithm is much more accurate than the famous ABL algorithm. It is also faster and more accurate than most current node partition algorithms.3. We propose an empirical study on community quality scores on a large number of massive complex networks. To our surprise, we even find that there is no significant relation between community sizes and their quality scores for most extracted communities. We also find that some very huge community with high mean quality values, and we can not find the universal "V" shape in their mean quality values. We also show that the well used metrics, such as the Conductance metric, Expansion metric and normalized Cut metric have the resolution problem. We define a new metric called mean network community profile (MNCP) based on NCP to measure the quality of communities of different sizes.4. Through the empirical study of a lot of real-world networks, we find out the vertex-vertex distance distributions in these networks conform well to the normal distributions with different means and variations, and we suggest a simple and fast vertex sampling algorithm to estimate the average shortest path length. Our algorithm is very fast and scales well to both of the massive graphs and the different pure topology artificial networks. The results show that we can estimate the average shortest path lengths quickly by just sampling a small number of vertices in massive networks.5. In addition to conceptual developments, we develop a network analytical platform called NeSVA for analyzing and detecting patterns in real-world complex networks. Our approach is empirically driven and the framework includes the following analysis strategies:network generating, structure statistical analysis, community extraction, network visualization. Based on these strategies, NeSVA incorporates several advanced techniques:graph modeling, link prediction, centrality analysis, community detection, graph drawing, etc. NeSVA is not only a general purpose analysis platform for empirical study but also a platform for evaluating the existing and newly proposed network analytical algorithms.Overall, the distinct feature of this dissertation is the study of methodology for community detection and multi-level visual analytics in massive complex networks. Innovations of the dissertation can enhance our ability to explore massive networks in multi-level views quickly and accurately.
Keywords/Search Tags:Complex Networks, Network Science, Graph Mining, Community Detection, Visual Analytics
PDF Full Text Request
Related items