
Design And Implementation Of NVM-based Distributed Backend Storage

Posted on: 2022-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: W Wang
GTID: 2518306764480184
Subject: Computer Software and Application of Computer
Abstract/Summary:
A distributed storage system, composed of clients and server clusters, provides scalable, highly reliable, and highly available storage services. The distributed backend storage is the software module on each server that manages disk capacity. Non-volatile memory (NVM) is byte-addressable, has asymmetric read and write performance, and offers DRAM-like access to data. Since Optane persistent memory went into production, many researchers have proposed index data structures optimized for persistent memory, but applying them in industry still requires considerable research and exploration. Based on an in-depth study of non-volatile memory and of the backend storage engine BlueStore, this thesis proposes the design and implementation of a distributed backend storage that can be deployed in current production environments.

BlueStore uses RocksDB as its database engine to hold metadata and deferred-write data, which satisfies the consistency requirements of distributed storage systems. However, the LSM-tree structure on which RocksDB relies suffers from serious read and write amplification. In addition, data in a traditional write-ahead log takes a long path to reach disk and cannot be written concurrently, which makes the log a bottleneck for write operations. To remove this bottleneck, this thesis stores the write-ahead log and the block cache, both of which lie on the critical read and write paths, in persistent memory. It also customizes a space management scheme for non-volatile memory based on the characteristics of the data, improving the overall performance of the distributed backend storage.

The specific work and innovations of the thesis are as follows:
1) Following the approach of BlueStore, RocksDB is used as the database engine, and a lightweight user-mode log file system is customized for it, which reduces the overhead introduced by a traditional file system.
2) Using instructions that bypass the CPU cache, together with the mmap-DAX mode, the write path of the write-ahead log is greatly shortened. The format of the write-ahead log is adjusted to take full advantage of the byte-addressability of persistent memory. In addition, each column family is bound to its own write-ahead log, which separates the life cycles of log entries and makes concurrent writes to the write-ahead log possible.
3) A DRAM-NVM two-level cache structure is applied to free up DRAM cache space. The persistent cache layer improves read efficiency and alleviates read amplification. Moreover, a cache algorithm evaluator for persistent memory is built that models cache access requests as a network flow. The evaluator reports the result of a minimum-cost maximum-flow computation, which helps users choose a cache replacement strategy for persistent memory.
4) Differentiated space management is used for the write-ahead log and the persistent cache. The write-ahead log adopts huge pages to reduce metadata overhead, while the persistent cache allocates and reclaims space through a heap manager that stores its metadata in DRAM, which reduces write operations to persistent memory and extends the lifetime of the NVM.
5) The NVM-based distributed backend storage is tested. Compared with BlueStore, its write performance is an order of magnitude higher, with a write bandwidth of 2.1 GB/s and an average latency of 410 ns. The test data is in line with the benchmark results of PMem100, showing that the design and implementation fully utilize the persistent memory.
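The two-level cache in item 3 can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name a replacement policy or a promotion rule, so the sketch assumes LRU in both tiers, promotion to DRAM on an NVM hit, and demotion to NVM on DRAM eviction. An ordinary in-memory dictionary stands in for the persistent tier, and the names TwoLevelCache, dram_capacity, and nvm_capacity are hypothetical.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Small DRAM LRU tier in front of a larger, simulated NVM LRU tier.

    Assumptions (not from the thesis): LRU in both tiers, promote on
    NVM hit, demote on DRAM eviction. A plain OrderedDict stands in
    for persistent memory.
    """

    def __init__(self, dram_capacity, nvm_capacity):
        self.dram = OrderedDict()  # hot blocks, most recently used last
        self.nvm = OrderedDict()   # stand-in for the persistent cache tier
        self.dram_capacity = dram_capacity
        self.nvm_capacity = nvm_capacity

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)  # refresh recency in DRAM
            return self.dram[key]
        if key in self.nvm:
            value = self.nvm.pop(key)   # NVM hit: promote to DRAM
            self._put_dram(key, value)
            return value
        return None  # miss: the caller reads from disk, then calls put()

    def put(self, key, value):
        self.nvm.pop(key, None)  # drop any stale copy in the lower tier
        self._put_dram(key, value)

    def _put_dram(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        while len(self.dram) > self.dram_capacity:
            old_key, old_value = self.dram.popitem(last=False)
            self._put_nvm(old_key, old_value)  # demote instead of dropping

    def _put_nvm(self, key, value):
        self.nvm[key] = value
        self.nvm.move_to_end(key)
        while len(self.nvm) > self.nvm_capacity:
            self.nvm.popitem(last=False)  # evict the coldest block entirely
```

With this layout a block evicted from DRAM can still be served from the larger tier without a disk read, which is the effect the abstract attributes to the persistent cache layer. A different replacement policy would change only the two eviction loops.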
Keywords/Search Tags:Distributed Backend Storage, Persistent Memory, BlueStore, RocksDB, LSM-tree