Design And Implementation Of Distributed Transactions Based On Distributed Graph Database

Posted on: 2022-10-13; Degree: Master; Type: Thesis
Country: China; Candidate: Y Z Guo; Full Text: PDF
GTID: 2518306524489664; Subject: Master of Engineering
Abstract/Summary:
With the growing popularity of knowledge graphs, companies are scrambling to store their data in graph database engines, and analysis and mining of that data are becoming increasingly important. However, data mining requires cross-analysis over many different categories of data. These data are maintained by different teams, are often not on the same server, and may even be stored in different databases. To mine such heterogeneous data, it must first be integrated for analysis, yet few well-established transaction frameworks support such cross-database transactions. In traditional distributed transaction frameworks, the two-phase commit protocol is the classic implementation, but it requires locking the resources involved and can even limit horizontal scaling of the service. The goal of this thesis is therefore to implement a distributed transaction system over multiple heterogeneous data sources. The scheme in this thesis is mainly based on eBay's GRIT transaction model. The following work is accomplished:

1. The distributed transaction module is designed and implemented based on the GRIT protocol, with corresponding optimizations for graph databases that reduce lock granularity. Combining the ideas of optimistic concurrency control and deterministic transactions, the commit process of a transaction is separated from its decision process, which reduces both the amount of data transferred between modules and the waiting time of transactions.

2. According to the characteristics of graph databases, a transaction concurrency scheme is designed that distinguishes attributes from relations when judging conflicts, which improves the degree of parallelism between transactions. The fault-tolerant recovery strategy of the system is designed in detail to ensure that data is not lost in case of downtime and, to a certain extent, that transactions are not interrupted by server downtime. A load balancing strategy for modules under large-scale concurrency is implemented to ensure that no single module becomes a bottleneck.

3. Through the analysis of transaction logs, a high-performance distributed streaming data storage system is designed and implemented. It uses an in-memory parallel caching strategy and improves the LSM-Tree algorithm to address read and write amplification; it performs redundant backup of data based on a master-slave replication strategy; and, according to the application scenario, it implements an efficient data caching strategy to meet the design requirement of efficient access to multiple transaction logs.

4. In the testing section, the two modules described above and the overall transaction system are tested for functionality and performance. The functional test results show that the transaction system and its modules meet the design requirements and can provide services correctly when an exception occurs during transaction execution. The performance test results show that logging performance and transaction performance meet the intended goals. The test results are analyzed in detail at the end of the testing section in the context of application scenarios.
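The conflict judgment described in item 2 can be illustrated with a minimal sketch. This is not the thesis's implementation: the names `Key`, `Txn`, `fine_conflict`, and `coarse_conflict` are hypothetical, and the check shown is a generic optimistic-concurrency backward validation, assumed here only to show why separating attribute keys from relation keys allows more transactions to run in parallel than vertex-level conflict detection would.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Key:
    """Hypothetical conflict key: one attribute or one relation label of a vertex."""
    kind: str    # "attr" or "rel" (assumed encoding)
    vertex: str
    name: str

@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction record holding only its read and write sets."""
    txn_id: int
    read_set: frozenset
    write_set: frozenset

def fine_conflict(candidate: Txn, committed: Txn) -> bool:
    """Fine-grained check: conflict only if the candidate read or wrote
    exactly the same attribute/relation key that the committed txn wrote."""
    touched = candidate.read_set | candidate.write_set
    return bool(touched & committed.write_set)

def coarse_conflict(candidate: Txn, committed: Txn) -> bool:
    """Vertex-level check for comparison: any overlap on a vertex conflicts."""
    touched = {k.vertex for k in candidate.read_set | candidate.write_set}
    written = {k.vertex for k in committed.write_set}
    return bool(touched & written)

# t1 updates an attribute of "alice"; t2 adds a relation on the same vertex.
t1 = Txn(1, frozenset(), frozenset({Key("attr", "alice", "age")}))
t2 = Txn(2, frozenset(), frozenset({Key("rel", "alice", "follows")}))
# t3 reads the attribute that t1 writes.
t3 = Txn(3, frozenset({Key("attr", "alice", "age")}), frozenset())
```

In this sketch, `t1` and `t2` conflict under the vertex-level check but not under the fine-grained one, so both could be validated concurrently; `t3` conflicts with `t1` under either check because it reads the key `t1` writes.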
Keywords/Search Tags: Distributed transactions, Two-phase commit protocol, Deterministic database