Design And Implementation Of A Main-memory Distributed Database

Posted on: 2018-09-17
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhang
Abstract/Summary:
Databases deployed in production environments are primarily intended for OLTP. They either lack support for OLAP or stall transaction processing while analysis is running. In recent years, with the growing popularity of business intelligence applications, data warehouses have been adopted across many industries to supply the data that supports enterprise decision-making. The common practice is to periodically transfer data from the production environment into the data warehouse. Because the transferred data is relatively stale, however, offline analysis in the warehouse is unconvincing in industries that are highly sensitive to data freshness, and it cannot effectively support decision-making.

This thesis studies the theory and methods of fusing OLTP and OLAP. We design and implement a distributed main-memory database, focusing on a processing scheme for deterministic transactions. We integrate the storage and functionality of OLTP and OLAP systems while preserving transaction processing performance. The main work and innovations are as follows:

1. The system integrates OLTP and OLAP. It proposes and implements a data processing scheme in which baseline data and incremental data coexist, and it can provide real-time data for analysis.

2. All data is stored in memory rather than on disk. We use InfiniBand verbs to implement an RDMA communication library, and all communication between nodes uses RDMA instead of conventional TCP/IP over Ethernet. As a result, the system achieves better performance.

3. We propose and implement a deterministic distributed transaction processing framework. A globally ordered vector clock decides the order of all distributed transactions, which are split into single-partition transactions. All transactions executed on a partition node follow the order of the vector clock, which eliminates two-phase commit. For concurrency control, a deterministic locking mechanism makes traditional concurrency control methods unnecessary.

4. For fault tolerance, data is synchronously replicated across nodes via Paxos. This ensures that single-partition transactions have the same execution order on multiple replicas. Checkpoints are established with shadow paging supported by virtual memory, so memory snapshots can be taken faster than with other approaches.

Testing shows that the system's basic functionality and transaction performance meet the design goals. The system correctly stores incremental data and efficiently merges it with the baseline data.
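To make the deterministic ordering in the third contribution more concrete, the following is a minimal sketch and not the thesis implementation. It assumes a plain integer counter in place of the globally ordered vector clock, and it leaves out Paxos replication, locking, and RDMA transport entirely. The names `Sequencer`, `PartitionReplica`, and `demo` are hypothetical and chosen only for illustration. The sketch shows a transaction being split into single-partition fragments, and two replicas of the same partition reaching the same state because both execute fragments in global order.

```python
from collections import defaultdict
from itertools import count


class Sequencer:
    """Hypothetical global ordering service (an integer counter stands in
    for the vector clock described in the abstract)."""

    def __init__(self):
        self._next = count(1)

    def order(self, writes):
        """Assign one global sequence number to a transaction and split it
        into one fragment per partition.

        writes maps (partition, key) -> value.
        Returns a list of (seq, partition, {key: value}) fragments.
        """
        seq = next(self._next)
        fragments = defaultdict(dict)
        for (partition, key), value in writes.items():
            fragments[partition][key] = value
        return [(seq, p, kv) for p, kv in fragments.items()]


class PartitionReplica:
    """Hypothetical replica holding the data of a single partition."""

    def __init__(self, partition):
        self.partition = partition
        self.data = {}
        self.applied = []  # sequence numbers in execution order

    def apply_log(self, log):
        # Execute fragments strictly in global sequence order,
        # regardless of the order in which they arrived.
        for seq, partition, kv in sorted(log, key=lambda f: f[0]):
            if partition != self.partition:
                continue  # fragment belongs to another partition
            self.data.update(kv)
            self.applied.append(seq)


def demo():
    sequencer = Sequencer()
    log = []
    log += sequencer.order({(0, "x"): 1, (1, "y"): 10})  # spans two partitions
    log += sequencer.order({(0, "x"): 2})                # single partition
    log += sequencer.order({(1, "y"): 20, (0, "z"): 3})  # spans two partitions

    replica_a = PartitionReplica(0)
    replica_b = PartitionReplica(0)
    replica_a.apply_log(log)
    replica_b.apply_log(list(reversed(log)))  # simulate different arrival order
    return replica_a, replica_b
```

Because the execution order is fixed before any fragment runs, the two replicas do not need to coordinate at commit time in this sketch, which is the intuition behind dropping two-phase commit.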
Keywords/Search Tags: main-memory database, transaction processing, determinism, replication, RDMA