Research On Optimization Of Reading And Writing Efficiency Based On LSM-tree

Posted on:2023-08-14

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Ma

GTID:2568306836976639

Subject:Computer technology

Abstract/Summary:

As network application data grows year by year, the storage of unstructured data has become a topic of general interest in current storage systems. KV storage systems, which take the LSM-tree (Log-Structured Merge-tree) as their mainstream architecture, provide quality service for data-intensive workloads. To improve read performance, key-value systems add Bloom filters to the LSM-tree. Configuring a Bloom filter for each SSTable greatly improves the system's read performance thanks to the filter's fast filtering ability, but Bloom filter false positives also cause additional I/O requests. In addition, current key-value separation designs do not fully account for the speed difference between sequential and random writes. When storage space is reclaimed, subsequent write operations are affected, leading to write pauses. In this paper, we present HSKV (Hash structure Segment KV Storage), based on a study of the read and write performance of key-value databases. 1) To accelerate writes, HSKV alters the in-memory write cache structure, segmenting it to store the key-value pairs to be written. 2) To address read amplification in the LSM-tree and improve query performance, we propose a hash-based KV/KP caching mechanism based on selective KV separation. 3) We propose a new garbage collection strategy to reduce write pauses. 4) HSKV uses machine learning models for indexing, which can quickly find the position corresponding to the queried key. 5) The compaction process is optimized: we use Level hashing as auxiliary storage during compaction, which accelerates compaction. The purpose of HSKV is to achieve high update performance with key-value separation under update-intensive workloads. HSKV uses hash-based data grouping, writes key-value pairs sequentially to storage, and caches hotter data in memory for faster reads and writes. In an evaluation against several existing KV storage systems, HSKV improves update throughput by up to 4.5×.
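
The idea of a hash-segmented write cache with sequential writes, as described in the abstract, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the class name `SegmentedWriteCache`, the parameters `num_segments` and `segment_capacity`, the use of CRC32 as the hash, and the in-memory list standing in for on-disk storage are all assumptions made for illustration only.

```python
import zlib


class SegmentedWriteCache:
    """Hypothetical write cache split into hash-selected segments (illustrative only)."""

    def __init__(self, num_segments=4, segment_capacity=2):
        self.segments = [dict() for _ in range(num_segments)]
        self.segment_capacity = segment_capacity
        self.log = []    # stands in for an append-only log on storage
        self.index = {}  # key -> position of its latest record in the log

    def _segment_of(self, key):
        # CRC32 is deterministic across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(key.encode("utf-8")) % len(self.segments)

    def put(self, key, value):
        seg_id = self._segment_of(key)
        segment = self.segments[seg_id]
        segment[key] = value  # repeated updates to a cached key are absorbed in memory
        if len(segment) >= self.segment_capacity:
            self._flush(seg_id)

    def _flush(self, seg_id):
        # Append the whole segment to the log in one pass (a sequential write).
        segment = self.segments[seg_id]
        for key, value in segment.items():
            self.index[key] = len(self.log)
            self.log.append((key, value))
        segment.clear()

    def get(self, key):
        # Check the in-memory segment first, then fall back to the log via the index.
        segment = self.segments[self._segment_of(key)]
        if key in segment:
            return segment[key]
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]
```

In this sketch a full segment is flushed as a single batch of appends, so each flush touches storage sequentially, while reads consult the in-memory segment before the log. Stale log records left behind by updates would need garbage collection, which the sketch does not model.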

Keywords/Search Tags:

LSM-tree, KV database, Bloom Filter, YCSB, KV separation

Related items

1	Content Synchronization In Distributed Systems
2	The Design Of Bloom Filter Algorithm For Key-value Storage
3	Privacy Preserved Bloom Filter And Key-value Based Bloom Filter
4	The Research On Bloom Filter Based On The Tree Structure
5	Research And Application Of Data Deduplication Technology Based On Bloom Filter
6	Research On Sampling Algorithm In Network Traffic Measurement
7	Multi-Bloom-Filter Query Algorithms And Their Applications
8	Research And Application Of Bloom Filter In Duplicated Webpages Deletion
9	A Fast IP Lookup Algorithm Based On Pivot-pushing And Bloom Filter
10	Towards Efficient Read For LSM-tree-based Key-Value Stores