Font Size: a A A

Design And Implementation Of Scalable Distributed Deterministic Database

Posted on:2020-07-10Degree:MasterType:Thesis
Country:ChinaCandidate:Z Y DongFull Text:PDF
GTID:2428330620958866Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Deterministic databases determine transactions' order in advance and execute the transactions according to the predetermined order deterministically.With the above features,deterministic databases can eliminate the overhead of distributed commit protocol and avoid the frequent rollbacks caused by contention.Compared with traditional databases,deterministic databases achieve high performance and good scalability in high contention.Therefore,deterministic databases have become a hot topic in academia and been tried in industry.Unfortunately,we find that deterministic databases have poor scalability in low contention.Specifically,for transactions' determinism,server nodes need to schedule transactions in a predetermined order,which limits scalability.Deterministic databases' implementation combines deterministic locking and Two-Phase Locking(2PL)to deterministically execute transactions.Once transactions are received,the server node analyzes their read-write sets and acquires deterministic locks for them considering the predetermined order.Thus,the determinism can be achieved by the ordered locking.Evaluation and analysis show that the scheduling of deterministic locking becomes the system's performance bottleneck in low contention.As a result,the overall performance of deterministic databases cannot scale to more than 4 worker threads.Aiming at the above problems,we provide the Deterministic and Optimistic Concurrency Control(DOCC),which ensures both deterministic transaction execution and high scalability.Inspired by Optimistic Concurrency Control,we divide transaction execution into Execution,Waiting,Validation and Commit Phase.We observe that by validating the transactions in the predetermined order,deterministic databases can also guarantee deterministic transaction execution.Based on the above observation,we design and implement DOCC,which allows transactions' parallel execution and only restricts transactions' validation and commit order,which improves the scalability.Meanwhile,we further optimize read-only transactions with multiversion,avoiding read-only transactions blocking the transaction processing.At the same time,to decrease the overhead of transaction's rollback,we propose a data prefetching optimization,eliminating the data indexing overhead during retry.We also provide an efficient garbage collection(GC)mechanism that uses different GC techniques to handle varying situations.Finally,based on above DOCC's algorithm and optimizations,we provide a scalable distributed deterministic database called DOCC-DB and describe its design and implement in detail.To show the performance improvement of DOCC,we use the TPC-C benchmark to evaluate DOCCDB and Calvin,a state-of-the-art distributed deterministic database.The evaluation indicates that DOCC-DB can achieves 8X performance improvement compared with Calvin.
Keywords/Search Tags:Distributed Database, Deterministic Database, Concurrency Control, Scalability
PDF Full Text Request
Related items