Design And Implementation Of Critical Technologies In Distributed Graph Database

Posted on:2021-02-24

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Li

GTID:2428330623968542

Subject:Engineering

Abstract/Summary:

Today's Internet systems generate massive amounts of unstructured, interconnected data. Because these systems evolve continuously, the data often lack a unified format, the types of objects keep changing, and the relationships between objects are complex and variable. A graph database, a newer type of database designed specifically to store graph data, adapts well to this kind of workload and is an active research topic.

The difficulties and bottlenecks in graph database development are concentrated in two areas. First, how to design a database on native graph storage, rather than simply wrapping graph semantics around the storage engine of another type of database. If the storage layer does not account for the locality of graph data and for relationship queries, performance will not necessarily improve substantially. Second, how to provide a computing layer with rich graph query capabilities, so that users can query the database in terms of graphs. Other types of query languages are inherently weak at expressing graph queries and are not sufficient for querying a graph database.

This thesis proposes a high-performance distributed graph database architecture that attempts to solve the following problems:

1. How to efficiently index vertices and edges based on the locality of graph data in a distributed setting. This thesis uses a hash-based data distribution algorithm to divide the graph into multiple shards, and accelerates vertex and edge lookups through a specially designed data storage format and by physically storing related data in nearby locations.
2. How to implement a high-performance computing layer framework that is scalable (in both deployment and implementation) in a distributed setting. This thesis uses an additional abstraction layer and a table-based data abstraction so that all operators share a common event-based, high-performance asynchronous programming environment, which simplifies their design and development.
3. How to design and implement operators in the computing layer so that they can efficiently express and execute the Cypher graph query language. This thesis proposes a dedicated algorithm for generating logical and physical execution plans, as well as an operator scheduling algorithm for a distributed system, so that two-hop relationship queries over millions of nodes can be completed in 100 ms.

This thesis also gives the implementation details and test report of the computing layer in the distributed graph database architecture. The implementation of the storage layer is omitted because of space limits and the division of work within the project; it will be presented in a separate thesis.
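The hash-based sharding idea in problem 1 can be illustrated with a minimal sketch. This is not the thesis's implementation, whose storage format is not described here. It assumes that each edge is stored in the shard of its source vertex, so that a vertex and its outgoing edges sit together; the names `shard_of`, `partition`, `neighbors`, `two_hop`, and the shard count `NUM_SHARDS` are all hypothetical.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_of(vertex_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a vertex ID to a shard using a stable hash (SHA-256 here)."""
    digest = hashlib.sha256(vertex_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(edges, num_shards: int = NUM_SHARDS):
    """Assign each (src, dst) edge to the shard of its source vertex.

    Assumption: co-locating a vertex with its outgoing edges is one simple
    way to keep neighbor lookups on a single shard.
    """
    shards = defaultdict(lambda: defaultdict(list))
    for src, dst in edges:
        shards[shard_of(src, num_shards)][src].append(dst)
    return shards


def neighbors(shards, vertex_id: str, num_shards: int = NUM_SHARDS):
    """Look up outgoing neighbors by visiting only the owning shard."""
    return shards[shard_of(vertex_id, num_shards)].get(vertex_id, [])


def two_hop(shards, start: str, num_shards: int = NUM_SHARDS):
    """Collect vertices reachable in exactly two hops from `start`."""
    result = set()
    for mid in neighbors(shards, start, num_shards):
        result.update(neighbors(shards, mid, num_shards))
    return result


if __name__ == "__main__":
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("c", "e")]
    shards = partition(edges)
    print(sorted(neighbors(shards, "a")))
    print(sorted(two_hop(shards, "a")))
```

In this sketch a one-hop lookup touches one shard, while a two-hop query touches at most one shard per intermediate vertex. In a real distributed deployment each of those shard visits could be a network request, which is why operator scheduling matters for queries of this shape.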

Keywords/Search Tags:

graph database, database, distributed system, non-relational database, high performance

Related items

1	Design And Implementation Of Integrated Query Middleware About Relational Database And Non-Relational Database For Structure Safety Monitoring
2	Research On Migration Algorithm From Traditional Relational Database To Non Relational Database
3	Distributed Data Process In Graph Database
4	Study Of Erlang-based Key-Value Database
5	Based On Historical Query Relational Database Query Optimization Keyword Research Questions
6	The Design And Implementation Of Distributed Relational Database Based On KVM Cloud Computing Platform
7	Research On Application Of Relational And Non-Relational Database
8	Performance Assessment Of EMR Systems Based On The Database Cache
9	Design And Realization Of Storage Subsystem Of Distributed Relational Database
10	Reduce The Data Model Of The Distributed Database Concurrency Conflict