
Design And Implementation Of Critical Technologies For A Distributed Graph Storage System

Posted on: 2021-02-28  Degree: Master  Type: Thesis
Country: China  Candidate: W Mao  Full Text: PDF
GTID: 2428330620964190  Subject: Engineering
Abstract/Summary:
In recent years, the processing of ultra-large-scale complex graphs and social graphs has become a prominent problem in the Internet industry. Compared with traditional data processing, the data is not only large in scale but also typically takes the form of a logical graph structure, and processing it consists of repeated iteration over vertices and edges. Traditional relational databases offer only limited support for storing and processing graph data at this scale.

This thesis addresses the key issues in storage for distributed graph databases. In a distributed graph storage system, the graph storage strategy and the partitioning algorithm are central to the whole system. A good partitioning algorithm preserves the structure of the graph as far as possible, reduces the number of cut edges, and preserves locality within the partitions, which fundamentally reduces the network overhead of data processing and improves system response time. The storage strategy is responsible for building a storage model for the partitioned graph, storing the data reliably, and providing strong support for queries and writes.

The thesis focuses on three aspects. The first is to improve the execution efficiency of the data partitioning algorithm while maintaining partition quality. The second is to design a data model suited to the characteristics of graph data. The third is to design and implement reliable distributed storage for graph data. The main work is as follows:

1) We introduce the origin and development of graph databases, analyze the requirements that current graph databases place on the underlying graph storage system, and survey the classification and research status of graph partitioning algorithms. We then study in depth the technologies needed to implement a distributed graph storage system.

2) We study the current mainstream partitioning algorithms for graph data and analyze their advantages and disadvantages. We optimize the efficiency of the HDRF partitioning algorithm in parallel execution scenarios while maintaining partition quality.

3) We investigate the existing mainstream graph storage schemes and propose a reliable distributed graph storage scheme that provides good read and write performance and high availability. The storage system defines its own data model and storage model, an index update mechanism that runs without downtime, and range query functions for graph database queries. It also includes a dedicated traversal mechanism. In addition, it keeps multiple replicas of the data with consistency guarantees, and it automatically rebalances the load on the cluster when the set of data nodes changes.

Detailed functional and performance tests are carried out on the distributed graph storage system. The tests show that the optimized graph partitioning algorithm greatly improves partitioning efficiency while maintaining partition quality, and that the distributed graph storage system achieves adequate throughput and responsiveness.
Keywords/Search Tags: distributed graph storage, parallel partition algorithm, graph partition algorithm optimization, key-value
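
For readers unfamiliar with HDRF (High-Degree Replicated First), the partitioning algorithm named in the abstract, the following is a minimal sequential sketch of the baseline streaming vertex-cut heuristic as it is commonly described. It is not the parallel optimization developed in the thesis, whose details are not given here. The function name `hdrf_partition` and the parameters `lam` (balance weight) and `eps` (a small constant that avoids division by zero) are illustrative assumptions.

```python
from collections import defaultdict


def hdrf_partition(edges, num_parts, lam=1.0, eps=1.0):
    """Assign each edge to a partition with the baseline HDRF heuristic.

    Hypothetical helper for illustration only; a sequential sketch,
    not the parallel variant described in the thesis.
    """
    degree = defaultdict(int)    # partial degree of each vertex seen so far
    replicas = defaultdict(set)  # partitions that hold a copy of each vertex
    sizes = [0] * num_parts      # number of edges stored in each partition
    assignment = []              # chosen partition for each edge, in order

    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        # Normalized degrees: the higher-degree endpoint gets the larger
        # theta and therefore a smaller bonus, so it is replicated first.
        theta_u = degree[u] / (degree[u] + degree[v])
        theta_v = 1.0 - theta_u

        max_size, min_size = max(sizes), min(sizes)
        best_part, best_score = 0, float("-inf")
        for p in range(num_parts):
            # Replication term: reward partitions already holding u or v.
            rep = 0.0
            if p in replicas[u]:
                rep += 1.0 + (1.0 - theta_u)
            if p in replicas[v]:
                rep += 1.0 + (1.0 - theta_v)
            # Balance term: reward lightly loaded partitions.
            bal = lam * (max_size - sizes[p]) / (eps + max_size - min_size)
            score = rep + bal
            if score > best_score:
                best_part, best_score = p, score

        assignment.append(best_part)
        sizes[best_part] += 1
        replicas[u].add(best_part)
        replicas[v].add(best_part)

    return assignment, dict(replicas)


def replication_factor(replicas):
    """Average number of partitions per vertex (lower means fewer cuts)."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)
```

With a small `lam` the heuristic favors keeping a vertex's edges together; with a large `lam` it favors evenly sized partitions. For example, `hdrf_partition([(1, 2), (1, 3)], num_parts=2)` places both edges in partition 0, while the same call with `lam=10.0` spreads them over both partitions.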