| With the rapid development of the Internet, network analysis has gradually become a research hotspot, and community detection is an important topic within it. Generally, a community is characterized by dense connections among its own vertices and sparse connections to other communities. In recent years, many excellent community detection algorithms have been proposed and widely used in many fields. As networks keep growing in size, many of these algorithms have adopted parallel computing. However, most of them are static and therefore ill-suited to analyzing network evolution. Most dynamic community detection algorithms, in turn, run on a single machine, which makes them inefficient and makes large-scale networks difficult to analyze. It is therefore necessary to combine dynamic algorithms with parallel computing. According to whether communities may overlap, community detection research can be divided into non-overlapping community detection and overlapping community detection. As the names imply, in a non-overlapping community structure each vertex belongs to exactly one community, whereas an overlapping community structure allows a vertex to belong to multiple communities. Based on Spark, this thesis designs and implements two parallel dynamic network community detection algorithms and a parallel graph computing system. The main contributions are as follows. First, to address the limitations of existing community quality metrics, this thesis proposes a novel parallel community quality metric, PWCC. This metric is very sensitive to structural changes and ensures the accuracy of the community structure. Second, based on incremental computing, this thesis proposes a parallel dynamic non-overlapping community discovery algorithm, PICD. Its incremental clustering process consists of two main steps: first, a parallel global search finds the incremental vertices; then, local community ownership adjustment is performed in parallel only on these incremental vertices. This makes full use of the short-term smoothness of the network and continuously optimizes the PWCC of the network to obtain a high-quality community structure. Third, building on the community structure produced by PICD, this thesis proposes a parallel dynamic overlapping community discovery algorithm, PIOCD, which is also based on PWCC optimization and discovers high-quality overlapping community structures in the network. PIOCD extends the community ownership adjustment strategy by allowing vertices to be replicated into their neighboring communities, which allows the resulting communities to overlap. Extensive experiments are carried out on both artificial networks and real-world networks. The experimental results show that the two algorithms offer higher accuracy and stability: they produce more accurate community structures and achieve higher efficiency. In terms of time performance, the running time grows almost linearly as the scale of the network increases. Finally, this thesis designs and implements a parallel graph computing system. Based on Spark, it implements multiple network metric analyses and community detection algorithms and integrates them into the system as components. The system supports network import, network metric analysis, static community detection, dynamic community detection, and network display, making it easy for users to analyze and compute on large-scale networks. |
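
The first step of the incremental clustering process, finding the incremental vertices, can be illustrated with a small sketch. The abstract does not say how incremental vertices are identified, so this example assumes they are the endpoints of edges added or removed between two consecutive snapshots of an undirected network. The function name `incremental_vertices` is hypothetical, and the code is plain single-machine Python rather than the Spark-based parallel search described in the thesis.

```python
def incremental_vertices(old_edges, new_edges):
    """Return the vertices touched by any edge added or removed between snapshots.

    Assumption: the network is undirected, so (u, v) and (v, u) are the same edge.
    """
    # Normalize each edge to a frozenset so that edge direction is ignored.
    old = {frozenset(edge) for edge in old_edges}
    new = {frozenset(edge) for edge in new_edges}

    # Symmetric difference: edges present in exactly one of the two snapshots.
    changed = old ^ new

    # Collect every endpoint of a changed edge.
    return {vertex for edge in changed for vertex in edge}


if __name__ == "__main__":
    old_snapshot = [(1, 2), (2, 3), (3, 4)]
    new_snapshot = [(1, 2), (2, 3), (4, 5)]
    # Edge (3, 4) was removed and edge (4, 5) was added.
    print(sorted(incremental_vertices(old_snapshot, new_snapshot)))  # [3, 4, 5]
```

Only the vertices returned here would then be candidates for local community ownership adjustment; vertices whose neighborhoods did not change keep their community from the previous snapshot, which is where the short-term smoothness assumption saves work.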