Font Size: a A A

Research On Parallel Recognition Algorithm For Cluster Structure Of Complex Networks Based On Multi-core CPU

Posted on:2010-07-05Degree:MasterType:Thesis
Country:ChinaCandidate:L L GaoFull Text:PDF
GTID:2178360272495756Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Society, technology and information systems are often expressed as some form of complex network systems, Such as social networks (friends and co-operation network of networks, etc.), technology networks (Internet, World Wide Web as well as the power grid, etc.), biological networks (neural networks, metabolic networks, as well as the food chain networks).The important feature shown in Complex networks is cluster structure. Cluster has also been referred to as modules, community, class, group, group. In reality, the cluster structure of the network has a specific meaning. The cluster of the World Wide Web networks has similar subject. The cluster of citation network has relevant subject. The cluster of Biochemical networks has also some analogous functional units. Unfolding the cluster structure of these networks make us understand and develop these networks effectively. According to solving strategy of algorithms, complex network clustering algorithms can be divided into clustering algorithms based on Optimise and clustering algorithms based on heuristic. FN algorithm, CNM algorithm and Vincent algorithm are all the clustering algorithms based on optimizing modularity. These algorithms can identify the cluster structure in large networks. With the improvement of hardware performance, multi-core CPU becomes popular, but current algorithm can hardly extend in the multi-core CPU, so it is necessary to study the parallel clustering algorithms.In this paper, we study CNM algorithm and Vincent algorithm, analysising the performance and efficiency of these algorithms through the experiment, improving Vincent algorithm into a parallel algorithm, making Vincent algorithm running in multi-core CPU. This paper collects some common complex network data, employs the CNM algorithm and parallel Vincent algorithm to cluster these complex network data and analyze the statistical properties of these cluster. At the basis of others'previous work, analyzing degree distribution of these complex network data. Specifically: Comparing the CNM algorithm and Vincent algorithm:These algorithm are both the clustering algorithms based on optimizing modularity. This paper implement these algorithm using c++ language, compare the efficiency of these algorithms with some common complex network data. The running time of Vincent algorithm is much less than the CNM algorithm's, some complex network data's running time with the Vincent algorithm is half of the CNM algorithm's. CNM algorithm also required a lot of memory, while using the computy with 2G memory and the node number more than one million, the procedure that implemts the CNM algorithm collapse caused by Memory overflow. And modularity modularity obtained by Vincent algorithm is slightly higher than that obtained CNM algorithm. so Vincent algorithm is appropriate to find community structure in very large networks.Research on parallel Vincent Research:With the improvement of hardware performance, multi-core CPU becomes popular, running the procedure in a single CPU has reached bottleneck. Vincent algorithm is appropriate to find community structure in very large networks and this algorithm can also be improved a parallel algorithm. This paper process the second phase of Vincent algorithm, and make this phase run in parallel. In the second phase of original algorithm, we evenly divide the nodes of complex networks into m segments(m is the number of parallel),and the distance of the nodes in the m segments is more than 2,then then we create m threads to compute these m segments. The first method to divide the nodes of complex networks into m segments is clustering, clustering the complex networks to m segments and the node in the m segments has less edge to connected each other. the second method to divide the nodes of complex networks into m segments is traveling in breadth-first order. The second method is relatively simple and relatively easy to implement. For before Parallel algorithm running, it need to divide the nodes of complex networks into m segments, and the third phase of the algorithm cannot run in parallel, so the running time cannot reduce in double with increasing the number the CPU. Experiments show that the time using two CPUs decrease by 33% at least than that using single CPU, and decrease by 60% to the least extent than that using single CPU. In addition the modularity do not drop, so parallel Vincent algorithm can decrease the running time with the increasing the CPU number.Collect and statistical analysis the complex networks data: this paper collect some common complex network data, compute their average path length and average clustering coefficient, the average path length of social complex network is less than 6,so it verify the "small world " property of the complex network. this paper analyze degree distribution of these complex network data, finding that the degree distribution of road network is lineal and the degree distribution of the other network data is power laws or exponentials.Statistical analyze the number of the cluster.:This paper employ the parallel Vincent algorithm to cluster the large complex network, analyze the statistical property of the number of cluster. Experiments show that the number of cluster of the road complex network increase lineally, the number of cluster of other complex network increase exponentially. At the same time, cumulative distributions of number of cluster most complex network is power laws. Analysis of statistical properties is concerning with data and algorithm, so CNM algorithm is used to analysis these data and get the same result in this paper.The Parallel Vincent proposed in this article has a definitely practical application value that it can identify the cluster structure of large Complex networks rapidly through hardware capability .however, there is a number of deficiencies which can be improved, such as implementing the same function by using distributed algorithm,allocating calculation assignment to a vacant computer or assigning the calculation to GPU in order to utilize the hardware more efficiently since mostly the GPU is available. The research on the statistic speciality of cluster scale is based on module optimization. Further researches focus on other kinds of algorithm to analyze the above one: whether the results is similar or not.
Keywords/Search Tags:Complex Network, Cluster Structure, Parallel Algorithm, Statistical Analysis
PDF Full Text Request
Related items