
Research On Methods For Detecting Community Structure Based On Multivariate Statistical Analysis

Posted on: 2015-07-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Li
GTID: 1220330476453897
Subject: Communication and Information System
Abstract/Summary:
Detecting community structure is one of the most active topics in the theory of complex networks. Revealing communities gives significant insight into the structural and functional features of complex systems, uncovers the topological properties of complex networks, and helps explain the evolutionary behavior of dynamic networks. Community detection can therefore be applied to information retrieval and recommendation, information propagation and control, organization management, and other fields. As complex network theory and its applications develop, more and more researchers are turning to large-scale networks with complicated topologies, which pose a great challenge to the performance of existing algorithms. Meanwhile, the discovery of hierarchical and overlapping community structure places new functional requirements on existing algorithms. It is therefore necessary to design accurate and efficient algorithms that uncover hierarchical and overlapping community structure in complex networks.

As a branch of statistics, multivariate statistical analysis studies the inherent characteristics and statistical regularities of variables that are correlated with one another. Because complex networks consist of different communities, there are inherent relationships among communities, which lays a solid foundation for applying multivariate statistical analysis to community detection. In this thesis, we use multivariate statistical analysis to model network topology, and we design and improve several efficient community detection algorithms in terms of accuracy, time complexity, and functionality. Our main achievements are as follows:

1. We propose two improved spectral algorithms based on principal component analysis (PCA) for detecting community structure in complex networks. The first applies PCA to analyze the topological information of the network, uses a preset threshold to decide the optimal number of eigenvectors for spectral analysis, and then uncovers the community structure. To reveal overlapping communities, the second uses the statistical features of PCA to decide the optimal number of eigenvectors adaptively, then computes the Laplacian matrix and selects eigenvectors to map nodes into a low-dimensional subspace. Finally, fuzzy c-means (FCM) is used to discover the overlapping community structure.

2. Because multidimensional scaling (MDS) preserves similarity information after mapping into Euclidean space, we use MDS to improve classic clustering algorithms for community detection. We use local topological features to compute a distance matrix and then map the nodes into Euclidean space. Based on the definition of community structure, we propose a local-expansion k-means algorithm to cluster nodes into communities. We then use shortest paths and FCM to extend this algorithm so that it detects overlapping community structure.

3. Based on factor analysis theory, we model nodes and communities as random variables and propose a community division algorithm and a micro-community merging model. Factor analysis is used to uncover the similarity between nodes, and we define local weak edges to identify the edges that lie between two communities. By iteratively removing the local weak edges that satisfy given mathematical conditions, the communities are split apart. To uncover overlapping communities, we use factor analysis to map nodes and communities to random variables and latent variables. After finding density-based micro-communities, the proposed model iteratively merges suitable micro-communities to form communities. Simulation results show that the proposed model can detect overlapping communities and identify the different roles of nodes within communities.

4. After analyzing the defects of the existing label propagation algorithm (LPA), we propose an improved LPA based on node variance. After using local topological structure to calculate the weight of each edge, we use node variance and a tunable threshold to identify the nodes that belong to different communities. Finally, we update the label of each node based on community size and the connection strength between the node and each community. To uncover the physical meaning of the tunable threshold, we study the trend of the node variance across iterations and define the average node preference. Based on the average node preference, we propose an improved LPA for uncovering overlapping communities. By changing the tunable threshold, the proposed algorithm can reveal communities with different degrees of overlap.
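As a rough illustration of the label propagation idea in the last contribution, the following is a minimal sketch of weighted label propagation, not the thesis's algorithm. It assumes a hypothetical edge weight of 1 plus the number of common neighbours as a stand-in for the "local topology" weight, and it leaves out the node variance, the tunable threshold, and the community-size term, since the abstract does not define them. The function names, the fixed visiting order, and the smallest-label tie-break are likewise assumptions made so the example is deterministic.

```python
from collections import defaultdict


def edge_weight(adjacency, u, v):
    # Hypothetical local-topology weight (an assumption, not from the thesis):
    # 1 + number of neighbours shared by u and v.
    return 1 + len(set(adjacency[u]) & set(adjacency[v]))


def weighted_label_propagation(adjacency, max_iters=100):
    """Asynchronous label propagation with weighted neighbour votes.

    adjacency: dict mapping each node to a list of its neighbours
               (undirected graph, every edge listed in both directions).
    Returns a dict mapping each node to its final label.
    """
    # Every node starts in its own community.
    labels = {node: node for node in adjacency}
    # Fixed visiting order (assumption; the usual formulation shuffles it).
    order = sorted(adjacency)

    for _ in range(max_iters):
        changed = False
        for node in order:
            # Total connection strength between this node and each label.
            strength = defaultdict(int)
            for nbr in adjacency[node]:
                strength[labels[nbr]] += edge_weight(adjacency, node, nbr)
            if not strength:
                continue  # isolated node keeps its own label
            best = max(strength.values())
            # Ties go to the smallest label (assumption, for determinism).
            new_label = min(l for l, s in strength.items() if s == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break  # no label changed during a full pass
    return labels


def communities(labels):
    # Group nodes by label and return the groups ordered by smallest member.
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(groups.values(), key=min)


# Toy graph: two 4-cliques joined by the single edge 3-4.
graph = {
    0: [1, 2, 3],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [0, 1, 2, 4],
    4: [3, 5, 6, 7],
    5: [4, 6, 7],
    6: [4, 5, 7],
    7: [4, 5, 6],
}

print(communities(weighted_label_propagation(graph)))
```

On this toy graph the bridge edge 3-4 has no common neighbours and so receives the lowest weight, which keeps the labels of the two cliques from spreading across it. With unweighted votes and the same tie-break, one label would flood the whole graph, which is one reason edge weighting is a natural place to improve plain LPA.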
Keywords/Search Tags: Complex network, Community structure, Multivariate statistics, Principal Component Analysis, Factor analysis, Multidimensional Scaling, Label propagation