Font Size: a A A

Research On Scalable Association-oriented Partitioning Approach For Streaming Graphs

Posted on:2016-07-29Degree:MasterType:Thesis
Country:ChinaCandidate:Y HaoFull Text:PDF
GTID:2348330479953395Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Graph is a relatively complex data structure. Nowadays graphs including social networks, biology network etc. are increasing rapidly. Many scenarios require the completed information of the graph, and when a graph does not physically fit on one machine, maintaining a coherent view of the entire state is impossible. Existing graph partitioning approaches not only incur high computation and communication costs for large graphs, but require full knowledge of the graph. Obviously, extracting knowledge by performing computations on graphs is becoming increasingly challenging as graphs grow in size. For example, new accounts are created and deleted every day in online social netwoks such as Facebook. Futhermore, heterogeneous graphs created upon post-processing datasets such as tweets,figures are also dynamic. Graphs are no longer static and unchangeable. It is crucial to have efficient graph partitioners of dynamic graphs.How to process dynamic graphs efficiently is now becoming a research hotspot in the area of big graph data. The balanced graph partitioning problem in the dynamic setting is known as streaming graph partitioning. Vertices or edges arrive and the decision of the placement of each vertex or edge has to be done “on-the-fly” in order to incur as little computational overhead as possible. Streaming graphs pose significant challenges at all levels, from partitioning clustering algorithms to distribution strategies. In order to process these large streaming graphs which are generated continuously, we propose an association-oriented streaming graph partitioning approach named Assc. The approach first computes the rank values of vertices with a hybrid approximate PageRank algorithm. It then clusters the vertices according to the rank values of the verices and their associations using an association clustering algorithm.Finally, comparing performance of Assc to Hash and METIS, a fast, off-line approach. Showing on a large collection of graph datasets with Assc can partition the streaming graphs with hundreds of millions of vertices that our approach outperforms Hash and METIS. Streaming partitioning approach Assc is not intended to substitute for a full information graph partitioning. And streaming partitioning algorithms Assc can be viewed as a preprocessing optimization step for graph processing.
Keywords/Search Tags:Streaming Graph, Association Partitioning, Clustering, PageRank
PDF Full Text Request
Related items