
An Empirical Study On Recent Graph Databases And Graph Computing Systems

Posted on: 2022-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: R Wang
Full Text: PDF
GTID: 2518306479493394
Subject: Software engineering
Abstract/Summary:
With the explosive growth of data volume and the increasing complexity of data structures in real-life application scenarios, graph data is widely used to model the intricate relationships among objects in a wide range of applications. Advances in graph data processing and management have brought significant value to areas such as artificial intelligence and knowledge engineering. As the fundamental infrastructure for managing and analysing graph data, graph management systems have received significant attention from both the industrial and academic communities. Two types of graph management systems have been developed: graph databases and graph computing systems. In recent years, many promising new-generation graph database and graph computing system products have also emerged. However, most existing benchmark studies focus on only a few popular systems. No benchmark study covers the new-generation products, which leaves users without a reliable basis for choosing among them. To fill this gap, this paper first gives a theoretical overview of the characteristics of graph databases and graph computing systems, then empirically evaluates the performance of several new-generation graph databases and graph computing systems separately, based on the LDBC (Linked Data Benchmark Council) benchmark, and finally summarizes key suggestions on how to select new-generation graph management systems under different scenarios.

In the theoretical analysis, this paper first gives an overview of the characteristics of popular and new-generation graph databases and graph computing systems, and then selects four graph databases (Neo4j, AgensGraph, TigerGraph, LightGraph) and four graph computing systems (PowerGraph, Pregel+, Gemini, Flash) as representatives for further investigation.

In the experimental research, to evaluate graph databases and graph computing systems fully and fairly, this paper chooses the LDBC benchmark and extends it to support more complex scenarios. Based on this benchmark, we model and construct a unified evaluation framework for automatically testing graph databases and graph computing systems separately. Our empirical studies of graph databases are conducted in a single-machine environment against the LDBC Social Network Benchmark. We compare three new-generation graph databases (AgensGraph, TigerGraph, LightGraph) with the most popular database, Neo4j, by evaluating bulk data import and the processing of micro and macro queries on datasets of three different scales. Our empirical studies of graph computing systems are conducted in a cluster environment on the extended LDBC Graphalytics benchmark. We compare three new-generation graph computing systems (Pregel+, Gemini, Flash) with the popular system PowerGraph by executing several popular graph algorithms on five datasets of different scales and structures. We also analyse the scalability, simplicity and generality of the four systems.

The experimental results show that the evaluated systems differ in performance and that no single system performs best in all cases. Based on these results, this paper provides suggestions for selecting new-generation graph databases and graph computing systems under different application scenarios. This is the first empirical study to compare these new-generation graph databases and graph computing systems in a unified environment, filling a gap in the field of graph management system benchmarking.
Keywords/Search Tags:Graph Databases, Graph Computing Systems, LDBC Benchmark, Performance Evaluation