
Research On Optimization Of Reading And Writing Efficiency Based On LSM-tree

Posted on: 2023-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Ma
Full Text: PDF
GTID: 2568306836976639
Subject: Computer technology
Abstract/Summary:
With network application data growing year by year, the storage of unstructured data has become a topic of general interest in current storage systems. Key-value (KV) storage systems, which take the LSM-tree (Log-Structured Merge-tree) as their mainstream architecture, provide quality service for data-intensive workloads. To improve read performance, KV systems add Bloom filters to the LSM-tree: configuring a Bloom filter for each SSTable greatly improves read performance thanks to the filter's fast filtering ability, but false positives of the Bloom filter still cause additional I/O requests. Moreover, current key-value separation designs do not fully account for the speed difference between sequential and random writes, and when storage space is reclaimed, subsequent write operations are affected, leading to write pauses. In this paper, we present HSKV (Hash structure Segment KV Storage), developed through a study of the read and write performance of key-value databases. 1) To improve write efficiency, HSKV alters the in-memory write cache structure, segmenting it to store the key-value pairs to be written. 2) To address read amplification in the LSM-tree and improve query performance, we propose a hash-based KV/KP caching mechanism built on selective KV separation. 3) We propose a new garbage collection strategy to reduce write pauses. 4) HSKV uses machine learning models for indexing, which can quickly find the position corresponding to a queried key. 5) The compaction process is optimized: we use Level hashing as auxiliary storage during compaction, which accelerates compaction. The purpose of HSKV is to achieve high update performance with key-value separation under update-intensive workloads. HSKV uses hash-based data grouping, writes key-value pairs sequentially to storage, and caches hotter data in memory for faster reads and writes. In an evaluation against some existing KV storage systems, HSKV improves update throughput by up to 4.5×.
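The idea of a segmented in-memory write cache with hash-based grouping and sequential writes can be illustrated with a minimal Python sketch. This is not HSKV's implementation: the class name `SegmentedWriteBuffer`, the parameters `num_segments` and `segment_capacity`, the use of CRC32 as the hash, and the in-memory list standing in for an append-only file are all assumptions made for illustration only.

```python
import zlib


class SegmentedWriteBuffer:
    """Hypothetical write cache split into hash-addressed segments."""

    def __init__(self, num_segments=4, segment_capacity=2):
        self.num_segments = num_segments
        self.segment_capacity = segment_capacity
        self.segments = [dict() for _ in range(num_segments)]
        # Stand-in for an append-only file on storage (assumption).
        self.log = []

    def _segment_of(self, key):
        # CRC32 is deterministic across runs, unlike built-in hash() on str.
        return zlib.crc32(key.encode("utf-8")) % self.num_segments

    def put(self, key, value):
        idx = self._segment_of(key)
        seg = self.segments[idx]
        seg[key] = value  # an update to a buffered key overwrites in place
        if len(seg) >= self.segment_capacity:
            self._flush(idx)

    def _flush(self, idx):
        # Append the whole segment as one sequential batch, then clear it.
        seg = self.segments[idx]
        self.log.extend((idx, k, v) for k, v in seg.items())
        seg.clear()

    def get(self, key):
        seg = self.segments[self._segment_of(key)]
        if key in seg:
            return seg[key]
        # Scan the log newest-first; a real system would use an index here.
        for _, k, v in reversed(self.log):
            if k == key:
                return v
        return None
```

In this sketch, repeated updates to a key that is still buffered are absorbed in memory, and each full segment reaches the log as a single append, which is the property that favors sequential over random writes.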
Keywords/Search Tags:LSM-tree, KV database, Bloom Filter, YCSB, KV separation