Font Size: a A A

Research On High Performance Key-value Stores For Systems With Emerging Storage Devices

Posted on:2021-10-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:T YaoFull Text:PDF
GTID:1488306107956509Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
In the big data era,the development of emerging storage devices is tracing the ten-dency of either higher storage capacity or faster accesses.For example,shingled magnetic recording(SMR)is the most promising technology that drives higher disk areal density with lower cost;non-volatile memory(NVM)provides fast accesses with DRAM-like perfor-mance and disk-like persistency.Although these emerging storage devices are essential for further improving storage systems,they face nontrivial challenges.Among various storage systems,key-value stores(KVS)are the backbone of data center and multiple data-intensive applications,which plays a significant role in the area of web indexing,social networking,photo store,and cloud store due to its high service quality and responsive user experience.Hence,optimizing key-value stores with emerging storage devices is widely concerned by both academia and industry.Focus on optimizing key-value stores with emerging storage devices,an SMR based key-value store is first presented to eliminate the redundant garbage collections and improve access efficiency.Second,by utilizing NVM in hybrid systems with DRAM-NVM-SSD storage,an NVM based KV store is proposed to mitigate the tail-latency and write amplification.Third,for key-value stores on systems with single-level NVM,an indexing structure is introduced to achieve universal operational efficiency and high scalability.Host-managed shingled magnetic recording drives(HM-SMR)give a capacity advan-tage to harness the explosive growth of data.Applications where data is sequentially written and randomly read,such as key-value stores based on Log-Structured Merge Trees(LSM-trees),make the HM-SMR an ideal solution due to its capacity,predictable performance,and economical cost.However,building an LSM-tree based KV store on HM-SMR drives presents severe challenges in maintaining the performance and space efficiency due to the redundant cleaning processes for applications and storage devices(i.e.,compaction and garbage collections).To eliminate the overhead of on-disk garbage collections(GC)and improve compaction efficiency,a GC-free KV store tailored for HM-SMR drives,called Gear DB,is presented.Gear DB proposes three new techniques:a new on-disk data lay-out,compaction windows,and a novel gear compaction algorithm.Gear DB is implemented based on Level DB and evaluated on the system with a real HM-SMR drive.The extensive experiments have shown that Gear DB achieves both good performance and space efficiency,i.e.,on average 1.71×faster than Level DB in random write with a space efficiency of 90%.Popular LSM-tree based key-value stores suffer from suboptimal and unpredictable per-formance due to write amplification and tail latency.The preliminary experimental studies reveal that(1)tail latency mainly stems from the significantly large amount of data involved in each compaction between L0-L1(i.e.,the first two levels of LSM-tree),and(2)write am-plification increases with the depth of LSM-trees.To address the two limitations by exploit-ing NVM,a new LSM-tree based KV store for systems with multi-tier DRAM-NVM-SSD storage,called Triangle DB,is presented.Triangle DB introduces two novel techniques.First,L0level is relocated and managed the in NVM by the triangle structure,and the fine-grained L0to L1compaction is proposed to reduce tail latency.Second,based on the triangle struc-ture,Triangle DB introduces three techniques to reduce write amplification,keep adequate read performance,and guarantee NVM data consistency.Triangle DB is implemented based on Rocks DB and evaluated on a hybrid DRAM/NVM/SSD system using Intel's latest 3D Xpoint NVM device Optane DC PMM.Evaluation results show that Triangle DB achieves3.82×random write throughput than Rocks DB.NVM provides a promising medium for building future memory systems.Its larger capacity and close-to-DRAM latencies call for key-value stores with better scalability and efficiency.However,the preliminary experimental studies show that existing indexing struc-tures in key-value stores face two common challenges,i.e.,lack of universal operational ef-ficiency and scalability.To solve the above two challenges,a three-tiered indexing structure with a non-intrusive background resizing policy,called Combo Tree,is presented.In Com-bo Tree,the middle tier,Tier B,is a sorted array of ordered keys,in which every two adjacent keys form a subrange,dividing the overall key space into multiple narrowed key ranges.The bottom tier,Tier C,consists of multiple space-saving B+Trees.Each B+Tree belongs to a subrange of Tier B,serving requests falling within this subrange.The top tier,Tier A,uses the cumulative distribution function(CDF)to quickly identify the location of a given key in Tier B at a constant time complexity.A background resizing policy is proposed to maintain sustained good performance as the capacity of the indexing structure grows.Combo Tree is implemented and evaluated on Intel's Optane DC PMM.Test results show that Combo Tree achieves 2.9×random write and 1.8×random read performance than the state-of-art ordered indexing structures.
Keywords/Search Tags:Shingled Magnetic Recording, Non-volatile Memories, Key-value store, Log-structured merge tree, Indexing structure
PDF Full Text Request
Related items