Font Size: a A A

The Parallel Streaming Sampling Algorithm For Dynamic Graph Based On Collaborative-induced Edge Sampling

Posted on:2017-05-26Degree:MasterType:Thesis
Country:ChinaCandidate:J YuFull Text:PDF
GTID:2348330503989803Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Graph data is produced continuesly and largely by applications such as social network services, scientific computing simulation and so on. It's very important to carry out a fast and effective analysis. Taking a representative subgraph from the original graph can save computational resources and improve process efficiency for some applications. Dynamic graph is changing all the time so that the static information of the whole graph cannot be accessed;thus sampled by streaming sampling method. Due to the cumulative iterative characteristic, streaming sampling process must be serial. When the scale of subgraph is large, sampling process is too slow to ensure realtime; while small subgraph cannot guarantee the similarity with the original graph. The existed parallel sampling algorithm is not suitable for dynamic graph; thus a parallel streaming sampling algorithm need to be proposed.Based on the analysis of the typical streaming sampling algorithm PIES(Partial-Induce edge sampling) and the parallelization scheme of PIES. A parallel streaming sampling method based on Collaborative-Induced edge sampling called PaStS is proposed. PaStS adopts a global vertex set to maintain the latest sampled vertices. When sampling parallelly, all vertices information is synchronized, so as to adjust the sampling vertex set sizes dynamically, and implement collaborative-induced edge sampling. PaStS solves the problem that the vertices and edges of subgraph reduce massively when the streaming sampling process is parallelized.In this research, three real-world graph datasets and one generated dataset are used to evaluate the efficacy of our proposed method, compared with PIES and its improved algorithms. According to experimental result, our method can accelerate the speed of sampling process; the sampling efficiency can be increased by 15 to 49 times when the parallel degree is 8. PaStS can preserve major graph feature better than current streaming sampling algorithms in most cases, and maintain the stability of features between original graph and subgraph when the dynamic graph is rapidly changing.
Keywords/Search Tags:Dynamic graph, Streaming graph, Graph sampling, Parallel sampling
PDF Full Text Request
Related items