Font Size: a A A

Research On Incremental Iterative Model For Large-scale Dynamic Graph Data Computation

Posted on:2021-06-07Degree:MasterType:Thesis
Country:ChinaCandidate:M S HeFull Text:PDF
GTID:2480306122468624Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The rapid development of online social networks,communication networks,sensor Networks,ect.has spawned a large number of quickly changing network datasets.Because graphs can capture complex dependencies and interactions in network data,network data can be naturally represented as a graph.The essential property of graphs is that they are usually dynamic.Many applications must take advantage of the latest graph to produce results that reflect their current state.However,the traditional computation method needs to be re-run on the entire data set,which has the problems of low efficiency and waste of computing resources.Incremental computation is an effective method to improve the processing efficiency of large-scale dynamic graphs.In addition,most graph algorithms(such as Page Rank)use iterative calculations to update the state of vertices until the given convergence conditions are met.However,the traditional iterative graph algorithm in the iterative process of the next round of iterative calculation depends on all the results of the previous round of iterative calculation.In fact,not every vertex in the result set produced by the previous round of iterative calculation needs to participate in the next round of iterations,because the state of most graph vertices will converge in advance in the iteration,and need not be processed in subsequent iterations.The inconsistent computing behavior exhibited by this iteration will lead to many redundant calculations.In order to solve the problems of low calculation efficiency and redundant calculation in dynamic graphs,this paper proposes a model called Inc Graph to support incremental iterative computation over dynamic graphs.Different from the way of traditional iteration,Inc Graph executes the graph algorithm through reusing the results of the previous graph and performs computation on the part of the graph that has changed.The contributions of Inc Graph include:(1)an incremental iterative computation model that consists of three steps: the first step is a pre-processing step,which proposes a graph search algorithm to obtain the changed vertices after the graph is updated;the second step is an incremental computation step,which is used to calculate the results of the changed vertices of the current graph;the third step is a merge computation step,which to calculate the results on the entire graph by using the results of the previous graph and the incremental step;(2)a fast incremental update method,which is based on the idea of asynchronous iteration,uses the previously computed vertex states of the current iteration to update the un-computed vertex states immediately,to optimize the iterative process within the iterative graph algorithm;and(3)a propagation control method,which filters the vertex data in the inactive state in the iteration process,so that the scale of the data to be processed in the subsequent iteration process will gradually reduce,so as to achieve fast convergence of the graph algorithm.We implement Inc Graph model based on Spark GraphX and evaluate its performance by using several representative iterative graph algorithms: Page Rank,Connected components,and Single Source Shortest Path.The results show that compared with the traditional iteration,when adding the 100 k of vertices in different size data sets,the performance optimization ratio of Inc Graph is 41.21% averagely,and 53.76% maximum;and when the percentage of added vertices varied from 0.01% to 10% in different data sets,the performance optimization ratio of Inc Graph varied from 22.87% to 70.1%.Moreover,the result error of Inc Graph is small and can be neglected.
Keywords/Search Tags:Spark GraphX, graph algorithm, dynamic graph, incremental iterative computation
PDF Full Text Request
Related items