Research On Optimizations For In-memory Key-value Database With RDMA And NVM

Posted on: 2022-11-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X C Qi
GTID: 1488306773982769
Subject: Computer Software and Application of Computer
Abstract:
With the rapid development of the Internet of Things, artificial intelligence, cloud computing, and the mobile internet, data volumes have grown explosively. This growth poses great challenges for data management. In-memory key-value databases are key infrastructure for data centers and data-intensive applications and are widely used in web search, e-commerce, cloud storage, social networks, and other fields. In the era of big data, applications continuously generate huge amounts of user data, which places high demands on the performance and availability of in-memory key-value databases. Their development is driven not only by applications; they must also be optimized for emerging hardware to obtain the best access performance. The emergence of high-speed networks supporting Remote Direct Memory Access (RDMA) and of Non-volatile Memory (NVM) storage devices brings new opportunities and challenges for further optimizing in-memory key-value databases.

This dissertation focuses on optimizations for in-memory key-value databases using RDMA and NVM. RDMA is a high-bandwidth, low-latency network communication technology that can accelerate the remote communication of in-memory key-value databases. NVM combines the persistence of traditional hard disks with near-DRAM access speed and can significantly improve key-value access efficiency. However, existing in-memory key-value databases are designed for legacy hardware, and migrating them directly to RDMA and NVM cannot realize the full potential of the new hardware. It is therefore worthwhile to investigate how to fully utilize the advantages of new hardware to design high-performance, highly available in-memory key-value databases. This dissertation first analyzes the performance and availability bottlenecks of in-memory key-value databases and the challenges of integrating them with new hardware. It then proposes three optimization schemes: one for in-memory key-value databases based on an event-driven communication framework, one for primary-backup replication based on a remote log persistence mechanism, and one for consensus protocols based on a follower-driven approach.

(1) Optimization for in-memory key-value databases based on an event-driven communication framework: RDMA-capable networks are a promising medium for enhancing in-memory key-value databases, but integrating their low-level network API into existing in-memory key-value databases requires significant code modification and architectural redesign. Moreover, existing system designs built for traditional Ethernet, such as the serialization protocol, are not suitable for RDMA-capable networks. To exploit the RDMA network efficiently and fully utilize RDMA features, this dissertation selects the best-suited RDMA primitive and constructs an event-driven RDMA RPC that uses an efficient request notification mechanism, rather than polling for requests, to reduce CPU consumption. It provides a generic network interface that can be easily integrated into existing in-memory key-value databases. It also proposes an optimized serialization protocol that minimizes the memory copy overhead of serialization and deserialization. In addition, it proposes a parallel task engine based on optimistic concurrency control, which is friendly to read-intensive workloads and further improves the performance of in-memory key-value databases.

(2) Optimization for primary-backup replication based on a remote log persistence mechanism: In-memory key-value databases often use primary-backup replication to support high availability. RDMA-capable networks and NVM storage can effectively eliminate the slow network and storage I/O bottlenecks in primary-backup replication. However, because of DDIO technology, the RDMA NIC writes data directly to the CPU cache of the remote node instead of to the remote NVM. If the remote node then fails, data inconsistency results. Moreover, real NVM devices suffer from read and write amplification. To address these challenges, this dissertation proposes an efficient remote log persistence mechanism that fully utilizes RDMA features with minimal network round trips and persistence overhead. To alleviate NVM write amplification, it designs a log-structured storage with a pipelined batch processing mechanism and parallel log replay, thus minimizing the impact of log replication on the performance of the primary node. In addition, it proposes a hotness-aware differential hash index to provide good read performance.

(3) Optimization for consensus protocols based on a follower-driven approach: Consensus protocols are widely used by commercial in-memory key-value databases for high availability. They not only ensure strong data consistency but also automatically elect a new leader to take over the service if the leader fails. However, the single leader easily becomes a bottleneck because it must handle a large number of client requests and also replicate the logs to all followers. To address this challenge, this dissertation proposes a follower-driven Raft protocol that exploits hybrid RDMA primitives to shift part of the leader's log replication burden to the followers, effectively reducing the CPU and network overhead of the leader. Further, the protocol takes advantage of the byte-addressable nature of NVM and supports a follower-driven log chase. In addition, this dissertation proposes a quorum follower read that enables followers to handle read requests without involving the leader, further reducing the leader's resource overhead.

In summary, this dissertation combines new RDMA and NVM hardware and illustrates how to optimize in-memory key-value databases by fully exploiting the features of the new hardware to achieve high performance and high availability from three aspects: the communication framework, primary-backup replication, and the consensus protocol. First, it constructs a high-performance, low-latency RDMA communication framework for in-memory key-value databases to accelerate network communication between clients and the server. Then, to solve the remote data persistence problem and the NVM write amplification problem, it proposes an efficient remote log persistence mechanism and an NVM-friendly storage structure for in-memory key-value databases that use primary-backup replication. Finally, it proposes a follower-driven optimized Raft protocol to provide highly available services for in-memory key-value databases. Extensive experimental results demonstrate the effectiveness of the proposed optimization schemes.
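
As a rough illustration of the optimistic concurrency control mentioned in scheme (1), the following Python sketch shows the general technique only: readers take no lock, writers validate a per-key version at commit time, and a conflicting update is retried. It is not the dissertation's parallel task engine, whose design the abstract does not describe. The names `VersionedStore`, `compare_and_put`, and `increment` are hypothetical and introduced here purely for illustration.

```python
import threading


class VersionedStore:
    """Toy in-memory key-value store with optimistic, version-validated updates.

    Hypothetical illustration; not the engine described in the dissertation.
    """

    def __init__(self):
        self._data = {}                      # key -> (version, value)
        self._commit_lock = threading.Lock() # held only during validate-and-commit

    def read(self, key):
        # Lock-free read: a single dict lookup returns a consistent
        # (version, value) pair because the pair is stored as one tuple.
        return self._data.get(key, (0, None))

    def put(self, key, value):
        # Unconditional write: bump the version so concurrent optimistic
        # updates based on the old version fail validation.
        with self._commit_lock:
            version, _ = self._data.get(key, (0, None))
            self._data[key] = (version + 1, value)

    def compare_and_put(self, key, expected_version, value):
        # Validation phase: commit only if nobody changed the key since
        # the caller read it; otherwise report a conflict.
        with self._commit_lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False
            self._data[key] = (version + 1, value)
            return True


def increment(store, key):
    """Read-modify-write without holding a lock during the compute step."""
    while True:
        version, value = store.read(key)          # optimistic read
        new_value = (value or 0) + 1              # compute outside any lock
        if store.compare_and_put(key, version, new_value):
            return new_value                      # validation succeeded
        # Conflict detected: another writer committed first, so retry.
```

Because reads never block and conflicts are only detected at commit, this style tends to suit read-intensive workloads, where validation rarely fails and retries are uncommon.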
Keywords: In-memory Key-value Database, Remote Direct Memory Access, Primary-backup Replication, Consensus Protocol