Font Size: a A A

Research On Storage System Of Large Scale Dynamic Evolving Graph

Posted on:2018-04-22Degree:MasterType:Thesis
Country:ChinaCandidate:Y YuFull Text:PDF
GTID:2348330521450950Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
In the field of graph data mining,the value of mining and analysis for dynamic evolving graph with temporal characteristics is getting more and more attention.The primary task for data mining is to obtain and store data.However,existing graph storage and query platforms are designed for the static graph,and cannot record the timing evolutionary process of dynamic graph.Therefore,this thesis studies the optimized storage model and retrieval method of the evolution graph,and designs an online storage system that can support the large scale dynamic graph.It can capture the whole evolution of the graph in near real time and provide good query capability.The main work of the thesis is as follows:Based on the basic theory of dynamic evolution graph and its classical representation model,a storage model with integrated redundancy and query performance is designed based on log file model and adjacency list.Aiming at the performance bottleneck problem of the model in the reconstruction phase of a snapshot,an optimized reconstruction method based on log digest is proposed.At the same time,an optimized retrieval method based on reconstruction of local query view is proposed for local query requests prevalent in online social network platform.The method is based on the 1 degree Ego network to construct the query view,as much as possible to reduce the size of the retrieval snapshot,which indirectly improves the response performance of such queries.In order to obtain the evolution of some high-speed update networks in near real-time,a parallel update algorithm based on hash packet is designed to speed up the graph update rate.The algorithm guarantees the consistency of the graph update process by limiting the parallel execution of removing-vertex transaction and adding-edge transaction,and strictly restricting the order of the real-time graph query transaction and the graph update transaction to ensure the correctness of the query result.Based on the above theoretical research,this thesis designs and implements an online storage system——Dynamic Graph DB,which can store a large-scale dynamic evolving graph.At the same time,LRU algorithm is used to solve the cache overflow issue when thesystem runs online.In order to verify the correctness and effectiveness of theoretical research and system design,the real Twitter and DBLP datasets are used to simulate the evolving networks in the real world to evaluate the correctness of graph storage and query,the ratio of storage redundancy and query efficiency,throughput of update operation,the size and query latency of partial views.The results show that the optimized storage model can not only store and query the evolving graph correctly,but also has better comprehensive advantages in storage efficiency and query performance than the implementation of the normal Log-File model.After optimized by the parallel updating algorithm based on the hash block,the system can achieve better throughput than the sequential updating strategy in single thread on multi-core computers.In addition,aiming at the local query requests which are widely available for online social network platforms,the snapshot retrieval strategy based on reconstructing the partial query view can effectively reduce the scale of query view,and indirectly improve the response performance for such queries.
Keywords/Search Tags:dynamic evolving graph, temporal graph, Log-File model, graph storage, online storage system
PDF Full Text Request
Related items