
Research On Optimizations Of Buffer Management And Storage Engines For Key-Value Stores

Posted on: 2023-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X L Wang
Full Text: PDF
GTID: 1528306902464084
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology and the maturing of cloud computing, the Internet of Things (IoT), artificial intelligence (AI), and other technologies, the data scale of applications such as e-commerce, social networks, and online games has grown explosively. As data volumes increase, applications demand ever higher data access performance, and handling massive data efficiently is a major challenge for database systems. Traditional relational databases can no longer meet the needs of these applications because of inherent limitations such as complex relational models and strong consistency requirements. Key-value databases offer simple interfaces, low latency, high throughput, and strong scalability; they have been widely used in a variety of complex scenarios and are an important means of dealing with massive data. However, the performance of existing key-value databases does not fully meet application needs. In the cache module, they suffer from low memory utilization and cannot be applied efficiently to hybrid memory. In the storage module, they suffer from serious write amplification because most of them are based on the LSM-tree (Log-Structured Merge-tree). This dissertation focuses on read/write optimization of key-value databases and investigates their buffer module and storage module. It aims to significantly improve performance through the design and optimization of buffer managers and LSM-tree-based storage engines, and to provide a new reference for the future development of key-value database technology. The main work and contributions are summarized as follows.

(1) We propose a new multi-grained buffer manager named MG-Buffer, which improves the efficiency of buffer management in key-value databases. Traditional buffer managers adopt page-based buffer pools. MG-Buffer instead combines a page-based pool with a tuple-based pool. It modifies data structures such as page descriptors to record tuple access information within pages, uses this information to identify hot and dirty tuples, and ensures that hot and dirty tuples are not swapped out during migration operations. We experimentally compare MG-Buffer with various existing buffer managers and show that MG-Buffer can improve the hit ratio by 20% while reducing the running time by 20%.

(2) We propose an adaptive buffer manager named AMG-Buffer, which enables key-value databases to adapt to dynamic workloads. We observe that tuple-based and page-based buffer managers each have their own pros and cons, so a buffer manager with a fixed buffer management granularity cannot adapt to dynamic workloads. Therefore, building on the page-based and tuple-based buffer managers, we propose a method that dynamically selects the buffer management granularity according to the clustering rate of pages, and we design AMG-Buffer accordingly. AMG-Buffer can dynamically change its internal data layout as the workload changes in order to improve overall performance. Experimental results on various workloads demonstrate the high performance and adaptiveness of AMG-Buffer.

(3) We propose a buffer manager for hybrid memory (DRAM + persistent memory, PM) named HiBuffer, which reduces write traffic to PM while maintaining the hit ratio and improves the overall performance of buffer managers in key-value databases. Considering the characteristics of the hybrid memory architecture, we design a new buffer management scheme that absorbs write traffic through a new buffer structure and a DRAM-based out-of-place update strategy, thus reducing the number of writes to persistent memory. HiBuffer also introduces a concurrent read mechanism that alleviates the extra read I/O caused by synchronous operations. We build an experimental environment on real Intel Optane PM and compare HiBuffer with various existing strategies; the experimental results confirm the effectiveness of HiBuffer.

(4) We propose a block-based compaction strategy for LSM-tree engines and build a new key-value storage engine named BlockDB, which addresses the write amplification and cache invalidation problems caused by LSM-tree compaction and improves the performance of LSM-tree-based key-value storage engines. Unlike traditional compaction, which operates on SSTable files, our compaction method (Block Compaction) operates at data block granularity. By reusing data blocks, it reduces the write amplification caused by compaction. In addition, we propose three optimization techniques for Block Compaction, namely Selective Compaction, Parallel Merging, and Lazy Deletion. Finally, we implement BlockDB, a new storage engine based on the block merging strategy, on top of LevelDB, and compare it with several existing storage engines, including LevelDB, RocksDB, and L2SM. The experimental results show that BlockDB can reduce write amplification by 32% and running time by 46%, and that it outperforms the comparison systems in both point and range queries.
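The abstract does not give the details of Block Compaction, so the following is only a toy sketch of the general idea of reusing data blocks during a merge, not BlockDB's actual algorithm. The names `Block` and `block_compact`, the integer keys, and the rule of rewriting only blocks whose key range receives newer entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (key, value); integer keys are an assumption


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a sorted, immutable data block."""
    entries: Tuple[Entry, ...]

    @property
    def first(self) -> int:
        return self.entries[0][0]

    @property
    def last(self) -> int:
        return self.entries[-1][0]


def block_compact(newer: List[Entry], lower: List[Block]) -> Tuple[List[Block], int]:
    """Merge newer entries into sorted, non-overlapping lower-level blocks.

    Blocks whose key range receives no newer entry are reused as-is.
    Returns the resulting blocks and the number of entries written anew.
    """
    pending: Dict[int, str] = dict(newer)
    out: List[Block] = []
    rewritten = 0
    for blk in lower:
        hits = {k: v for k, v in pending.items() if blk.first <= k <= blk.last}
        if not hits:
            out.append(blk)          # untouched block: reuse, no rewrite
            continue
        merged = dict(blk.entries)
        merged.update(hits)          # newer values win over older ones
        for k in hits:
            del pending[k]
        out.append(Block(tuple(sorted(merged.items()))))
        rewritten += len(merged)
    # Newer keys that fall in gaps between blocks become new blocks,
    # one per gap, so key ranges stay non-overlapping.
    kept = list(out)
    group: List[Entry] = []
    for k, v in sorted(pending.items()):
        if group and any(group[-1][0] < b.first < k for b in kept):
            out.append(Block(tuple(group)))
            rewritten += len(group)
            group = []
        group.append((k, v))
    if group:
        out.append(Block(tuple(group)))
        rewritten += len(group)
    out.sort(key=lambda b: b.first)
    return out, rewritten


lower = [
    Block(((1, "a"), (2, "b"))),
    Block(((10, "c"), (12, "d"))),
    Block(((20, "e"), (21, "f"))),
]
newer = [(11, "X"), (15, "Y"), (30, "Z")]
result, rewritten = block_compact(newer, lower)
total = sum(len(b.entries) for b in result)
print(f"rewrote {rewritten} of {total} entries")
```

In this toy input, two of the three lower-level blocks are carried over unchanged, so only 5 of the 9 resulting entries are written, whereas a merge that rewrites the whole file would write all 9. The figures reported in the abstract come from the author's experiments and are unrelated to this example.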
Keywords/Search Tags: Key-Value Store, LSM-tree, Buffer Manager, Persistent Memory