| Communities (also called modules or clusters) are a widespread structural characteristic of many complex networks. Community detection divides similar nodes into clusters such that the interaction among nodes within a cluster is stronger than the interaction between clusters; in graph terms, edges between vertices in the same community are dense, while inter-community edges are sparse. In recent years, community detection has been widely used in many different types of networks, such as the World Wide Web, social networks, and biological networks. The analysis of communities helps users understand real complex networks more clearly and can also be exploited for many social media mining tasks, such as business recommendation and user management. Although a large body of community detection methods has been proposed in recent years, some problems still need to be studied, such as how to detect community structure in complex networks effectively and accurately in a big data environment, and how to apply the results of community discovery to practical problems. To address these problems of existing community discovery algorithms, this paper proposes a number of algorithms to detect non-overlapping and overlapping communities in complex networks. Among them, the network embedding method maps the static network structure to a vector representation of nodes, and the label propagation method captures the dynamic propagation characteristics of the network. The main contributions are as follows: (1) The network embedding method for static structure feature learning. This paper proposes a novel network embedding model based on loop sampling, which maps each vertex of a network to a fixed-length, low-dimensional vector. Then, the k-means method is used to divide the nodes into separate communities according to the learned feature vectors. On this basis, this paper puts forward a network embedding model based on edge features in an
edge-centric view. Experimental results show that this method can be applied to large-scale networks and obtains effective results on several real-world networks. (2) The label propagation method for dynamic propagation characteristics. Observing that the distribution of node influence is always unbalanced in complex networks, this paper proposes a novel overlapping community detection algorithm called the Label-Propagation-Probability-Based (LPPB) algorithm. The probability of label propagation depends on the structural propagation characteristics of the complex network and on the properties of the nodes during propagation. Experimental results on benchmark datasets and the C-DBLP network show that LPPB is accurate and stable for overlapping community detection. This paper also investigates several label propagation methods and proposes a multiple label propagation strategy for community detection, called the MLPS algorithm. This approach combines the similarity propagation and influence propagation methods to guide the propagation of labels between nodes. Experimental results on synthetic datasets and real networks show that MLPS achieves both high accuracy and high modularity. To solve the problem of finding cluster centers in clustering methods, this paper proposes a novel core-leader-based label propagation algorithm for community detection, called CLBLPA. First, the core leaders of the potential communities are found using a greedy method. Then the label influence potential is used to guide the label propagation process. This accelerates the convergence of the algorithm and improves the stability of its output. Experimental results on synthetic datasets and real networks show that CLBLPA can significantly improve the quality of the output communities. (3) Fusion of the network embedding method and the label propagation method. Based on the characteristics of the network embedding method and the label propagation algorithm, this paper combines the two
into a comprehensive community discovery framework, which uses both the structural and the dynamic propagation characteristics of the network to mine communities. The fusion method transforms the node feature vectors output by the network embedding method into the distance measure between nodes used in the label propagation method, and this distance measure then guides the propagation of labels. The community quality of the fusion method is better than that of the network embedding and label propagation methods on their own. (4) Research on overlapping community structure and structural holes. This paper studies the overlapping structure of communities in detail. Dynamic analysis of the overlapping structure can reveal the behavioral characteristics of community-overlapping nodes and the dynamic development trend of the network. To find the structural holes between communities, an algorithm named Structural Holes between Communities Detection Algorithm (SHCDA) is presented. Experimental results on different datasets show that SHCDA achieves the best accuracy among the compared baselines and that structural holes have an important effect on network information diffusion. |
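
For readers unfamiliar with the label propagation idea that contribution (2) builds on, the following is a minimal sketch of the plain, non-overlapping baseline, not of LPPB, MLPS, or CLBLPA themselves. Each node starts with its own label and repeatedly adopts the label most common among its neighbors until nothing changes. The function names `label_propagation` and `communities`, the fixed node order, and the rule of breaking ties by the largest label are assumptions made here so that the run is reproducible; the usual formulation breaks ties at random.

```python
from collections import Counter, defaultdict


def label_propagation(edges, max_iters=100):
    """Plain label propagation on an undirected graph given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Every node starts in its own community.
    labels = {node: node for node in neighbors}

    for _ in range(max_iters):
        changed = False
        # Asynchronous sweep: later nodes see labels already updated in this sweep.
        for node in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[node])
            top = max(counts.values())
            # Assumption: break ties by the largest label so runs are reproducible.
            best = max(label for label, c in counts.items() if c == top)
            if labels[node] != best:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def communities(labels):
    """Group nodes by their final label and return sorted node lists."""
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(sorted(group) for group in groups.values())


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
result = communities(label_propagation(edges))
print(result)  # [[0, 1, 2], [3, 4, 5]]
```

On this toy graph the dense triangles keep separate labels because the single bridge edge is outvoted on both sides. The methods summarized in the abstract differ from this baseline in how the propagation is guided (propagation probabilities, similarity and influence, core leaders, or embedding-derived distances).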