Font Size: a A A

Towards Efficient Read For LSM-tree-based Key-Value Stores

Posted on:2019-07-06Degree:MasterType:Thesis
Country:ChinaCandidate:Y M ZhangFull Text:PDF
GTID:2428330542494232Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Persistent key-value stores play a critical role in data-intensive applications,and their most commonly used design is LSM-tree.LSM-tree groups key-value pairs into a large number of fixed-size files.These files are called SSTables,which are further organized into multiple levels.When performing a read operation,a key-value store needs to issue multiple I/O requests to check the SSTables of different levels until the target key-value pairs are obtained.Excessive I/O requests result in poor read perfor-mance.To improve read performance,the key-value store configures a Bloom filter for each SSTable.However,Bloom filter has false positive,and the high false positive rate makes the Bloom filter unable to improve the read performance effectively.In addition,existing designs adopts uniform settings for all bloom filters,and they are unable to dy-namically adjust the false positive rate of the bloom filter.The method of uniformly reducing the false positive rate will bring huge memory overhead to the system.To improve read performance and reduce the amount of memory space used by Bloom filter,we propose ElasticBF(Elastic Bloom Filter).ElasticBF constructs mul-tiple small filters for each SSTable.Different false positive rates can be obtained for combining different number of small filters.ElasticBF dynamically loads small filters as needed for each SSTable according to the access frequency,which implements a fine-grained and elasticity Bloom filter mechanism.In order to load a suitable number of small filters for each SSTable,we aim to minimize the I/O requests caused by false positives of Bloom filters,and based on the access frequency of each SSTable,we have designed a Bloom filter adjustment rule.In order to achieve dynamic adjustment of the filter with a small overhead,we have extended the data structure of Multi-Queue and used it to manage the Bloom filters.Therefore,ElasticBF is able to dynamically adjust the false positive rate and memory usage of filter of each SSTable.It can greatly reduce(up to 67%in the experiment)I/O requests due to false positives of Bloom fil-ters in the read operations with the same memory usage.So it can greatly improve read performance.We implemented ElasticBF based on LevelDB and compared per-formance with LevelDB.The experiment results show that,ElasticBF can achieve up to 2.24Śread throughput compared to conventional Bloom filter design in LevelDB,and preserves the same write performance.During the integration of ElasticBF into the system,we further studied the con-figuration of small filters to improve the read performance during system startup and reduce the impact of overhead of dynamic adjustment on read performance.First,based on the characteristics of LSM-tree,we propose the Heterogeneous Bloom Filter(HBF).Subsequently,the model was established to quantitatively describe the relationship be-tween the system read performance and the Bloom filter configuration.Finally,we use the dynamic programming algorithm to solve the optimal configuration scheme of heterogeneous Bloom filter.The configuration scheme can not only guide the config-uration of small filters in ElasticBF,but also can be used to improve read performance in key-value stores.
Keywords/Search Tags:Key-value store, LSM-tree, Bloom filter, False positive rate
PDF Full Text Request
Related items