Font Size: a A A

Research And Implementation Of LSM-tree Based Key-value Store

Posted on:2022-12-25Degree:MasterType:Thesis
Country:ChinaCandidate:Z X FangFull Text:PDF
GTID:2518306743951759Subject:Master of Engineering
Abstract/Summary:PDF Full Text Request
With the advent of the big data era,the field of data storage has shown the trend of increasing data scale and rapid growth of unstructured data,which makes it difficult for traditional relational databases to meet the data storage requirements of many Internet applications.Keyvalue storage systems are widely used in various data-intensive applications because of its flat data organization and excellent horizontal scalability.However,with the continuous growth of data scale and read/write requests,key-value storage systems also suffer from write amplification and read amplification problems.This thesis develops a study on these,focusing on optimizing the performance of key-value storage based on LSM-tree(Log Structured Merge Tree).The main research work and contributions of this thesis are as follows:(1)A differentiated management method based on key-value type is proposed: Although the write performance of LSM-tree is excellent,in order to maintain the read performance of the system and reduce space amplification,it is necessary to perform compaction operations on the data on disk periodically.And the compaction operation generates a large amount of disk I/O overhead,leading to severe write amplification,which in turn affects the system throughput.This thesis proposes a differentiated key-value management mechanism,which classifies key-value pairs according to their value sizes,implements different storage and garbage collection schemes for different types of key-value pairs,and implements Luc Key,a storage system based on this mechanism,to reduce the write amplification problem while ensuring the range read performance.(2)An adaptive cache optimization mechanism is proposed: Although Luc Key reduces the write amplification problem through the differentiated and fine-grained key-value management mechanism,the system still has the problem of read amplification due to the structural characteristics of LSM-tree.To further improve the read performance,this thesis proposes an adaptive caching mechanism and implements it based on Luc Key.The adaptive cache mainly includes two components,Hot Cache and Warm Cache,which are used to cache the data with higher access frequency and the data with lower access frequency,respectively,and can dynamically adjust the size of both according to the change of workload.For different types of key-value pairs,the objects stored in the cache are different,and this differentiated caching strategy based on the type of key-value pairs achieves a good balance between performance and space usage.(3)Experimental validations of the above proposed methods were conducted.Experiments show that Luc Key with differentiated key-value type-based management achieves higher performance in point read,write,and range read,and has more balanced performance in mixed workloads;adaptive cache optimization further improves Luc Key's read performance and adaptively adjusts the cache size with workload changes.
Keywords/Search Tags:Key-Value Store, LSM-tree, Cache
PDF Full Text Request
Related items