Font Size: a A A

Research On Range Filter For Key-Value Stores

Posted on:2019-01-17Degree:MasterType:Thesis
Country:ChinaCandidate:X Q WangFull Text:PDF
GTID:2428330545476738Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With the advent of the era of mobile internet,the amount of data is booming,while the data itself is becoming more unstructured.These new trends have brought great challenges.Meanwhile,key-value store systems have shown striking advantages and become a hotspot in research,for its simplicity,scalability and performance.State of art key-value store systems mainly adapt LSM-Tree structure in consideration of write performance.While this structure does boost the write performance,it will also have some negative influence on read operations,especially range queries,because this will result in many invalid disk access.Many key-value store system use bloom filter to improve point query performance.However,since bloom filter exploits hash functions,the order information of data is ruined,thus it cannot optimize range query process,which has become the shortcoming of many key-value store systems.Targeted at the above issues,this paper proposes a range filter for key-value stores in order to improve range query efficiency.The main contributions of this article are as follows:1.A range filter is designed and implemented considering the characteristics and re-quirements of key-value storage systems.The body of the filter is a binary tree,each node corresponds to a range,meanwhile,each leaf node records whether or not any key-value pairs fall within its corresponding range.With this structure,key-value stores can filter range queries.Then through theoretical performance analysis,we reveal the relationship between the filter performance and the memory budget.On this basis,in order to improve filtering performance and better handle different storage contents and workloads,a dynamic adjustment scheme is designed and implemented for the range filter so that the subject binary tree can dynamically grow and shrink according to the actual workload,thereby improving the filtering efficiency.2.The performance of the range filter is evaluated by experiments.Experimental re-sults show that in most cases,the range filter can achieve a low false positive rate of range queries with a small memory cost,thus achieve a good filtering result.A se-ries of optimizations were performed on the range filter in response to the high false positive rate brought by long strings observed in experiments.Based on the charac-teristics of some typical actual application scenarios,a common prefix processing mechanism and a filter autonomous splitting mechanism are designed and imple-mented.The former reduces the length of keys needed to be processed by the range filter by extracting the common prefixes of the keys;the latter further enhances the effect of extracting the maximum common prefix by making the range filters dynamically fragment,and addresses the issue caused by long strings in a targeted manner.The experimental results show that these optimizations can greatly reduce the false positive rate of the range queries and significantly improve the filtering result.
Keywords/Search Tags:Key-Value Store, Range Query, Filter, Binary Tree, Dynamic Segmentation
PDF Full Text Request
Related items