Font Size: a A A

Research And Implementation On Hybrid Compaction Mechanism Based On Log-structured Merge Tree

Posted on:2020-08-02Degree:MasterType:Thesis
Country:ChinaCandidate:J J LiFull Text:PDF
GTID:2428330590958317Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
In recent years,key-value stores based on Log-structured merge tree(LSM-tree)have been applied to various fields because of its simple structure and high write throughput.The two compaction strategies commonly used in LSM-tree are Tiered Compaction Strategy(TCS)and Leveled Compaction Strategy(LCS).TCS has overlapping key ranges in the same level,reduces data rewriting to maintain the sorted order,but the exponentially increasing size of Sorted String Tables(SSTables)will take up a lot of temporary storage space and potentially reduce query performance;LCS uses fixed Size SSTables,but in order to maintain the sorted order in the same level,data is rewritten frequently.First,in order to combine the advantages of the above two compaction strategies,we propose a hybrid compaction mechanism(HCM),which has overlapping key ranges in the same level and uses fixed Size SSTables.Specifically,HCM only merge sorts the current level SSTables and write to the next level.The SSTables' s key ranges in the same level may overlap,which reduces the data rewriting to maintain the sorted order.At the same time,the fixed size SSTables are used,and if the SSTable which to be merged has overlapping key ranges with the others SSTables which participates compaction,then excutes physical merge operation.If there is no overlapping key ranges,it will move directly to the next level,and no physical merge operation will be generated,thereby reducing write amplification.In addition,the merge process is monitored,and the merged physical table is deleted in time to reduce the space occupation in the compaction process.Second,we present two methods to optimize query performance: file-level bloom filter and parallel query mechanism.The file-level bloom filter accurately filters out physical tables that may have target keys,reducing unnecessary physical table query operations.The parallel query mechanism can implement multiple physical table concurrent queries,making full use of the high inherent parallelism in the SSD.Finally,we implement HCM on LevelDB.We evaluate the HCM with both db_bench and YCSB workloads.The experimental results show that HCM reduces write amplification by 3.0% and 96.4% compared to TCS and LCS,reduces the average query latency by 80.4% and 65% compared to TCS and LCS.
Keywords/Search Tags:Key-value Store, Log-structured Merge Tree, Hybrid Compaction Mechanism, FileLevel BloomFilter, Parallel Query Mechanism
PDF Full Text Request
Related items