
Research and Implementation of Acceleration for Graph Databases

Posted on: 2020-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: H B Hu
GTID: 2428330596976770
Subject: Software engineering
Abstract/Summary:
A graph database is a new type of database that departs from the traditional database model and stores data as a graph, representing it with nodes, edges, and attributes. Since 2008, graph databases have attracted growing attention from developers and researchers in many fields. However, because graph databases are still young and relational databases absorb most development resources, both their technological development and their theoretical study have progressed unevenly. As research has advanced, some drawbacks of graph databases themselves have also become apparent. If a graph database is introduced directly into an existing development environment, its performance advantages may not be fully realized, and improper use can magnify its drawbacks and degrade the performance of the existing system. To address these problems, this thesis combines graph databases with the currently popular hybrid storage strategy, optimizes a caching scheme that exploits the characteristics of graph data, and carries out experiments on partitioning graph databases through middleware.

First, the thesis introduces an HDD + SSD hybrid storage design into the graph database, a design well suited to storing data separately. By exploiting the characteristics of the graph data held in the database, it improves the performance of the graph database under the hybrid storage strategy. To make full use of the SSD storage mechanism, an additional "tree decomposition" mechanism is added to the storage management of the graph database. With tree decomposition, graph data can be written to SSD devices symmetrically and more compactly, which reduces the fragmentation of SSD storage space and improves its efficiency. To address the large memory footprint of graph databases, the thesis introduces the FlashGraph framework to make further use of SSD storage space and reduce memory usage. In addition, because SSDs wear out quickly, the graph data is separated into hot and cold data according to its adjacency relationships and historical processing information, so that multiple erasures are merged into one and the service life of the SSD is extended.

Besides using the new storage strategy to reduce memory usage, another important way to speed up database queries is to build a cache system. The thesis discusses several key factors that affect a graph database cache system and proposes an ID cache system based on the characteristics of graph databases. Because IDs in a graph database are globally consistent, IDs and their corresponding statements are cached in memory to accelerate queries. Because graph data has a distinctive three-tier structure, a three-tier query is introduced. Full-byte matching of complex statements is added to handle the remaining kinds of graph database queries. For consistency, an ID correspondence mechanism ensures strong data consistency. For cache replacement, the decision combines two criteria: historical information and the temporal characteristics of the graph.

Finally, given that distributed technology for graph databases is immature and that existing graph databases are difficult to modify, the thesis introduces a middleware-based distribution technique for graph databases. The middleware works by statement parsing and route forwarding. It can integrate several existing graph databases so that they are physically separate but logically unified. For partitioning large graphs, a hybrid segmentation method that combines horizontal and vertical segmentation is adopted to handle large graph data in all cases. For cross-database queries in the distributed graph database, the thesis proposes storing graph connection information in the middleware, which makes it possible to track data across databases and to merge the multiple result sets in the middleware.
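
The cross-database idea in the last part of the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Shard` and `Middleware` classes, their methods, and the in-memory dictionaries are all hypothetical stand-ins. The sketch assumes that each backing graph database knows only its local edges, while the middleware keeps the node-to-database map and the edges that cross databases.

```python
from collections import defaultdict


class Shard:
    """Hypothetical stand-in for one independent graph database."""

    def __init__(self, name):
        self.name = name
        self.adjacency = defaultdict(set)  # local edges only

    def add_edge(self, src, dst):
        self.adjacency[src].add(dst)

    def neighbors(self, node):
        return set(self.adjacency.get(node, ()))


class Middleware:
    """Routes requests to shards and stores cross-shard connection info."""

    def __init__(self, shards):
        self.shards = {s.name: s for s in shards}
        self.location = {}                   # node ID -> shard name
        self.cross_edges = defaultdict(set)  # src -> dst nodes on other shards

    def add_node(self, node, shard_name):
        self.location[node] = shard_name

    def add_edge(self, src, dst):
        # Edges inside one shard are forwarded to it; edges that span
        # two shards are kept in the middleware itself (an assumption).
        if self.location[src] == self.location[dst]:
            self.shards[self.location[src]].add_edge(src, dst)
        else:
            self.cross_edges[src].add(dst)

    def neighbors(self, node):
        # Route to the owning shard, then merge in cross-shard edges.
        local = self.shards[self.location[node]].neighbors(node)
        return local | self.cross_edges.get(node, set())

    def reachable(self, start, hops):
        # Breadth-first expansion that may move from shard to shard.
        frontier, seen = {start}, {start}
        for _ in range(hops):
            found = set()
            for node in frontier:
                found |= self.neighbors(node)
            frontier = found - seen
            seen |= frontier
        return seen


shard_a, shard_b = Shard("A"), Shard("B")
mw = Middleware([shard_a, shard_b])
for node, shard in [(1, "A"), (2, "A"), (3, "B")]:
    mw.add_node(node, shard)
mw.add_edge(1, 2)  # stays inside shard A
mw.add_edge(2, 3)  # crosses from shard A to shard B
print(mw.reachable(1, 2))
```

In this sketch neither shard ever sees the edge from node 2 to node 3; only the middleware can follow it, which is the sense in which connection information held in the middleware lets a query be tracked across databases.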
Keywords/Search Tags: graph database, hybrid storage strategy, tree decomposition, caching technology, distributed middleware