Design And Implementation Of Transaction System Based On Distributed Columnar In-memory Database

Posted on: 2022-01-01  Degree: Master  Type: Thesis
Country: China  Candidate: F Han  Full Text: PDF
GTID: 2518306524489694  Subject: Master of Engineering
Abstract/Summary:
After years of development, the field of distributed databases has gradually become subdivided. By the way they process data, databases can be divided into transaction-oriented databases (OLTP), analysis-oriented databases (OLAP), and the relatively new hybrid databases (HTAP). OLTP and OLAP systems are relatively mature. Although many HTAP implementations are available on the market, most of them are based on hybrid row/column storage: transactions still rely on the row storage engine, and the column store obtains its data by synchronizing from the row store. Transaction schemes that target column storage directly are therefore still relatively rare. In this thesis, we design a distributed transaction scheme on top of a fully in-memory distributed OLAP database. The goal is to add read-write support to a read-only column storage engine and to build a distributed transaction system based entirely on column storage. The main contents are as follows:

1. An MVCC scheme is designed. In this model, reads and writes do not conflict with each other, and garbage collection is supported. Based on it, a skiplist is used to implement an efficient incremental index that supports lock-free insertion. The incremental index provides both an inverted index and a forward index with key-value semantics, and supports lookups by version number.

2. The original columnar groupkey index is modified to support multi-version storage, and a scheme is explored that uses the GPU to accelerate the construction and reading of the multi-version groupkey index.

3. A read-write hybrid engine based on the incremental index and the multi-version groupkey index is implemented, which supports transactions with local KV semantics. A data fusion scheme is designed that merges incremental index data into the groupkey index; because it uses a double-index mode, incremental data fusion does not block the execution of external system transactions. In addition, version management at transaction granularity and at the level of a single fragment is implemented to cooperate with the hybrid engine's garbage collection of old versions. Interface support is also provided for the original AP computing engine.

4. Based on the hybrid storage engine, a distributed transaction engine is implemented, including transaction ID generation, transaction status management, log management, metadata management, and a distributed two-phase commit adapted from Percolator. The distributed transaction engine uses multiple coordinators to avoid a single point of failure, and provides repeatable-read semantics through a concurrency control protocol based on version number management. In addition, it supports pessimistic, optimistic, and local transaction interfaces.

Finally, this thesis implements a hybrid storage engine and a distributed transaction system built on the transactional storage module, and then carries out functional verification and related performance tests, which demonstrate the correctness of parts of the system's design.
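To make the version-number lookup and old-version garbage collection described in item 1 more concrete, here is a minimal sketch. It is not the thesis's implementation: the class `VersionedStore` and its method names are hypothetical, a sorted Python list stands in for the skiplist, and the code is single-threaded, so it does not show lock-free insertion. It only illustrates why readers at a fixed snapshot version never conflict with writers that append newer versions.

```python
import bisect


class VersionedStore:
    """Toy multi-version KV store (hypothetical; a sorted list replaces the skiplist)."""

    def __init__(self):
        self._versions = {}  # key -> commit versions, ascending
        self._values = {}    # key -> values, parallel to _versions[key]

    def write(self, key, version, value):
        """Add a new version of `key`; existing versions are never overwritten."""
        versions = self._versions.setdefault(key, [])
        values = self._values.setdefault(key, [])
        i = bisect.bisect_left(versions, version)
        if i < len(versions) and versions[i] == version:
            raise ValueError("version already written for this key")
        versions.insert(i, version)
        values.insert(i, value)

    def read(self, key, snapshot_version):
        """Return the newest value with version <= snapshot_version, or None."""
        versions = self._versions.get(key, [])
        i = bisect.bisect_right(versions, snapshot_version)
        return self._values[key][i - 1] if i else None

    def collect_garbage(self, oldest_active_snapshot):
        """Drop versions that no active snapshot can still see.

        For each key, the newest version <= oldest_active_snapshot is kept
        (it is still visible to that snapshot); everything older is removed.
        Returns the number of versions removed.
        """
        removed = 0
        for key, versions in self._versions.items():
            i = bisect.bisect_right(versions, oldest_active_snapshot)
            drop = max(i - 1, 0)
            if drop:
                del versions[:drop]
                del self._values[key][:drop]
                removed += drop
        return removed


store = VersionedStore()
store.write("k", 10, "a")
store.write("k", 20, "b")
store.write("k", 30, "c")
print(store.read("k", 25))
```

A transaction that reads at snapshot version 25 sees the value written at version 20, regardless of the later write at version 30. Calling `collect_garbage(25)` would then discard only the version written at 10, since no snapshot at or after 25 can observe it.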
Keywords/Search Tags:MVCC, HTAP, in-memory database, column storage, distributed transaction