
Research And Implementation Of Key-Value Database Based On Separation Of Computing And Storage

Posted on: 2020-12-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Y F He | Full Text: PDF
GTID: 2428330596476539 | Subject: Engineering
Abstract/Summary:
In recent years, with the popularity of the Internet and the rapid development of big data, the business of cloud service providers and e-commerce companies has expanded, resulting in explosive growth in data volume. Storing and managing this massive data poses severe challenges for databases. A notable feature of the traditional distributed database architecture is that computing (query processing, query optimization, etc.) is physically tightly coupled with storage (durability, backup, fault recovery). In this mode, computing and storage compete for resources, which constrains the performance of both, and it is difficult for the cluster to scale quickly and flexibly. To solve these problems, this thesis opens the "black box" of the database and studies how to separate computing from storage in a distributed database. Specifically, computing logic such as query parsing and query optimization is physically decoupled from storage services such as durability, backup, and fault recovery, so that the computing layer is stateless and the storage layer is pooled. NoSQL databases have also been widely used for massive data management in recent years because of their high availability and scalability. The key-value data model, one of the most important branches of NoSQL, is used to implement the database system in this thesis. The following work has been completed:

(1) The significance and feasibility of separating database computing and storage are studied in depth, the current state of development of compute-storage separated databases in China and abroad is surveyed, and on the basis of this research the overall architecture design and the detailed module design of the system are completed.

(2) To address the network IO bottleneck caused by the separation of computing and storage, the design concept of "the log is the database" is adopted: only the log is transferred between the computing layer and the storage layer, and no dirty data is transmitted. The storage layer parses the log itself and applies the data in the background, which effectively reduces the network load and does not affect query processing in the computing layer.

(3) To solve the problem of stale cache data in the computing layer caused by the separation of computing and storage, a cache consistency strategy is proposed. A caching mechanism is introduced in the computing layer, which effectively reduces the network overhead incurred when the computing layer accesses the storage layer during database queries.

(4) A prototype system is implemented, and its functionality and performance are tested. In the performance test, the system is compared with Pegasus, a distributed key-value storage system with a tightly coupled architecture. The test results show that the system built on the compute-storage separation mode meets expectations and achieves a better level of performance.
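The log-shipping idea in item (2) and the compute-side cache in item (3) can be illustrated with a minimal single-process sketch. It is not the thesis's implementation: the class names (`LogRecord`, `StorageNode`, `ComputeNode`), the use of a log sequence number, and the write-through cache policy are all assumptions made for illustration, and the abstract does not describe the actual cache consistency strategy. The sketch also assumes a single writing compute node and replaces the network with direct method calls.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """One log entry shipped from compute to storage (hypothetical format)."""
    lsn: int               # log sequence number, assigned by the compute node
    key: str
    value: Optional[str]   # None marks a delete


class StorageNode:
    """Receives log records only and materializes data from them."""

    def __init__(self) -> None:
        self._pending: List[LogRecord] = []   # received but not yet applied
        self._data: Dict[str, str] = {}       # materialized key-value state
        self.applied_lsn = 0

    def receive(self, record: LogRecord) -> None:
        # A real system would persist and replicate the record here.
        self._pending.append(record)

    def apply_pending(self) -> None:
        # Background step: replay the log in LSN order to build the data.
        for rec in sorted(self._pending, key=lambda r: r.lsn):
            if rec.value is None:
                self._data.pop(rec.key, None)
            else:
                self._data[rec.key] = rec.value
            self.applied_lsn = rec.lsn
        self._pending.clear()

    def read(self, key: str) -> Optional[str]:
        # Apply outstanding log records first so reads see every shipped write.
        self.apply_pending()
        return self._data.get(key)


class ComputeNode:
    """Holds no durable data; ships log records and keeps a local cache."""

    def __init__(self, storage: StorageNode) -> None:
        self.storage = storage
        self._next_lsn = 1
        self._cache: Dict[str, str] = {}
        self.cache_hits = 0

    def _ship(self, key: str, value: Optional[str]) -> None:
        # Only the log record crosses the "network", never a data page.
        self.storage.receive(LogRecord(self._next_lsn, key, value))
        self._next_lsn += 1

    def put(self, key: str, value: str) -> None:
        self._ship(key, value)
        self._cache[key] = value      # assumed write-through cache update

    def delete(self, key: str) -> None:
        self._ship(key, None)
        self._cache.pop(key, None)    # drop the entry so it cannot go stale

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            self.cache_hits += 1      # served locally, no storage round trip
            return self._cache[key]
        value = self.storage.read(key)
        if value is not None:
            self._cache[key] = value
        return value
```

In this sketch a `put` returns as soon as the log record is handed to the storage node, and the storage node replays the log later, so the write path sends one small record instead of a modified data page. With more than one compute node, the write-through policy shown here would no longer be sufficient, because a cache on one node would not see writes made through another; that is the staleness problem item (3) refers to.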
Keywords/Search Tags:Distributed database, Compute and Storage Separation, Log Synchronization, Cache, Consensus Algorithm