| Complex network is a network system with complex topological structure and complex node behavior,which is the abstraction of a variety of large-scale complex systems in real world.The node of a complex network can be the basic entity of the network system with arbitrary specific dynamics and information connotation,and edge represents the relationship between these basic entities.Complex network is known as the rich and low-order connection patterns that can be acquired at the level of nodes and edges,and community structure is one of the most important patterns.Currently,exploring community structure in complex network has become one of the hottest research topics in modern complex network science.Studying community structure in complex network will contribute to identifying the underlying topologi-cal structure more clearly,promoting the rapid development of scientific research on complex network from macro level to micro level,which have important theoretical significance and practical application value.At present,with the increasing of the real network scale,a large number of global community detection algorithms have been unable to meet the requirement of the era due to excessive time and space complexity,how to design global community detection algorithms for large complex network is an important problem to be solved urgently.On the other hand,digging local community structure is especially necessary for large complex network,how to design fast local community detection algorithms is a new research hot in recent years.In the light of the problems above,we design global overlapping community detection algorithm and local community detection algorithms based on local spectral approximation for large complex network,the main work and innovation points of this dissertation are as follows:(1)For the global overlapping community detection problem in large complex net-work,we propose an algorithm called Quadratic Optimization-based Clique Expansion(QOCE)for uncovering the global overlapping community structure.QOCE includes four key steps:maximal cliques enumeration,random walk sampling,quadratic opti-mization and community detection.Firstly,making use of the Bron-Kerbosch(BK)clique enumeration algorithm to efficiently find all maximal cliques with at least three nodes due to its applicability for large complex network.After sorting all maximal cliques by their sizes in descending order,we remove any cliques if they overlap with a previous clique by at least 75%nodes.Secondly,in order to reduce the computational complexity of QOCE,we adopt a fast random walk with the maximal clique as the initial nodes to sample a local subgrph,so that the sampled subgraph could cover all the nodes in the neighborhood of the maximal clique as much as possible.Then,based on the Cheeger cut minimization,we design the quadratic objective function with l1regular term on the sampled subgraph and solve it by using numerical algo-rithm.Finally,community truncation selection is carried out according to the optimal solution vector of quadratic objective function.QOCE is not only suitable for large complex network but also can detect the overlapping community structure due to the overlapping phenomenon in different maximal cliques expansion processes.Extensive experimental results show that the community detection accuracy of QOCE is higher than that of several benchmark algorithms for comparison on real-world networks as well as synthetic datasets.(2)For the local community detection problem in large complex network,we propose a new approach called Krylov Subspace Spectral Approximation(KSSA)for local community detection.KSSA includes three key steps:Breadth First Search(BFS)sampling,Krylov subspace spectral approximation and local community detec-tion.Firstly,in order to reduce the computational complexity of KSSA,starting from a few labeled seed members,BFS extension technique is used to obtain the local sam-pled subgraph that contains the seed set,so that the sampled subgraph could cover all nodes of the target community which the seed set belongs to as much as possible.Secondly,based on spectral graph theory,we adopt Krylov subspace to locally approx-imate the eigen subspace on the sampled subgraph,and seek a sparse indicator vector supported by the seeds via minimizing an one-norm over the Krylov subspace.The element of the sparse indicator vector denotes the probability that the corresponding node belongs to the target community.Finally,local community truncation is carried out on nodes corresponding to the elements of the sparse indicator vector.Due to the overlapping phenomenon in the expansion process of different seed sets,KSSA can be used to detect the overlapping community structure.Extensive experimental results show that the community detection accuracy of KSSA is higher than that of several popular benchmark algorithms on synthetic datasets and large real-world networks.(3)For the local community detection problem in large complex network,we pro-pose a novel method called Lanczos Iteration Spectral Approximation(LISA)for local community detection.LISA includes three key steps:heat kernel sampling,Lanczos iteration spectral approximation and local community boundary truncation.Firstly,in order to reduce the computational complexity of LISA,starting from a few labeled seed members,we adopt fast heat kernel diffusion technique on the first try to sample the local subgraph of the seed set,so that the sampled subgraph could cover all nodes of the target community which the seed set belongs to as much as possible.Secondly,based on the Rayleigh quotient minimization,a fast Lanczos iteration method is de-signed on the sampled subgraph to locally approximate the eigenvector of transition matrix with the largest eigenvalue,and the convergence analysis of Lanczos iteration method is given.Finally,the nodes corresponding to elements of the local eigenvec-tor obtained by Lanczos iteration approximation are truncated so as to obtain the community with local minimal conductance.Due to the overlapping phenomenon in the expansion process of different seed sets,LISA can be used to detect the overlap-ping community structure.Extensive experimental results show that the community detection accuracy of LISA is higher than that of several benchmark algorithms for comparison on synthetic datasets and large real-world networks.To sum up,we design a global overlapping community detection algorithm called QOCE and local community detection algorithms named KSSA and LISA for large complex network,respectively.Extensive experimental results show that the commu-nity detection accuracy of the three algorithms are higher than that of the benchmark algorithms on real-world networks as well as synthetic datasets. |