| In the era of big data,the underlying data models for many problems are the large-scale dynamic graphs,whose numbers of vertices and/or edges are over tens or even hundreds of million,and the vertices or edges are changing in real time.For example,social network graphs,Web pages linkage graph,etc.In order to efficiently perform distributed/parallel computing of large-scale dynamic graphs,large-scale dynamic graphs must be balanced and rationally divided into k partitions,which is the key basis and premise of efficient large-scale dynamic graph distributed/parallel computing.In general,an efficient large-scale dynamic graph partitioning algorithm must satisfy the following requirements: 1)The communication cost of the graph partition between inter-partitions is as small as possible;2)The sizes of all graph partitions should be balance;3)As efficient partitioning algorithm for the largescale dynamic graph is the key and premise of the graph distributed/parallel computing,the algorithm should be efficient and effective.Due to the complexity of the large-scale dynamic graph partitioning problem and dramatic needs for the large-scale dynamic graph distributed/parallel computing,it has been an NP-hard problem and needed to be solved urgently.For the efficient large-scale dynamic graph partitioning problem,a few researchers and technicians have begun to focus on it in recent years.Though some novel algorithms have be proposed,the following drawbacks of existing algorithms are prevalence: 1)The algorithms’ time performance has a large room to be improved;2)The algorithms can’t adapt to real-time changes of vertices or edges in large-scale dynamic graphs;3)The existing algorithms only focus on the size’s balance of vertices or edges on each partition regardless of the size’s balance of active vertices or edges in the process of graph calculation;4)The time and space costs are often too high for auxiliary computation.Therefore,the innovative research on and implementation of an efficient large-scale dynamic graph partitioning algorithm have very great theoretical and practical significance.The thesis stems from the National Natural Science Foundation of China for innovatively designing and developing an efficient large-scale dynamic graph processing framework and the graph partitioning algorithm.Aiming at overcoming the weaknesses of existing algorithms,firstly,on the basis of static graph process framework Graph X,with introducing some novel operations to the vertices or edges of the dynamic graph,an efficient largescale dynamic graph processing framework—Dyn Graph X is proposed.Then,based on Dyn Graph X,an efficient and effective distributed large-scale dynamic graph distributed partitioning algorithm—CBH-DGP(Centrality-based Hash for Dynamic Graph Partition)is presented.In one word,the main work and innovation of the thesis are as follows:1)For efficient and effective identifying those active high-central vertices in a large-scale dynamic graph,a new vertex centrality measurement strategy is proposed.Where vertex centrality describes the importance degree of a vertex in a network/graph.2)In order to avoid the calculation imbalance of large-scale dynamic graph,a new edgehashing strategy based on the vertex centrality measurement strategy is proposed,which can uniformly hash the active center vertices to k partitions of a large-scale dynamic graph.3)Considering no existence of a dynamic graph processing framework now,some new operations for active vertices or edges in a dynamic graph are introduced to the static graph processing framework Graph X resulting in a new efficient large-scale dynamic graph processing framework–Dyn Graph X coming into being.Moreover,on the basis,a distributed two-stage incremental large-scale dynamic graph partitioning CBH-DGP algorithm is proposed and designed.4)On a large number of large-scale dynamic graph datasets,the proposed large-scale dynamic graph processing framework Dyn Graph X and our algorithm CBH-DGP are fully tested and verified on the datasets.The experimental results show that our Dyn Graph X is correct and feasible,and that our CBH-DGP algorithm is much better than the-state-of-art large-scale dynamic graph partitioning algorithms in efficiency and accuracy.As the problems of the directed graph’s topological structure features and the relationship between the low-central vertices have no consideration in our presented framework Dyn Graph X and algorithm CBH-DGP at present,how to overcome these problems is our future research directions. |