Font Size: a A A

Research On Log-structured Merge Tree Based On Key Distribution

Posted on:2022-09-22Degree:MasterType:Thesis
Country:ChinaCandidate:X H DuanFull Text:PDF
GTID:2568306500950239Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Nowadays Internet applications generate massive amounts of data at an astonishing speed,which brings huge challenges to storage system.The key-value pair model is widely used due to its flexible structure and strong scalability.Persistent keyvalue(KV)stores are commonly employed by data-intensive enterprise applications to process data requests.Persistent key-value(KV)stores are popularly organized as LogStructured Merge-trees(LSM-trees)to facilitate sequential write accesses for high performance.However,LSM-tree based KV stores are known to suffer write amplification and read amplification.To address the issue derived from the LSM-tree,we first deeply explore the adverse effect caused by merge-sort compaction.Then we found that the compaction will change the key distribution of corresponding data levels,and this inspired us to use key distribution law of LSM-tree to guide the compaction operations.Therefore we propose a light-weight metric,called gradient,to reflect the key distribution density and trend of the level and SST files.Through detailed motivational experiments,we summarize three observations around gradient changes to help optimize the performance of LSM-tree.We present Grad KV,a novel gradient-guided KV store that keeps track of gradients across its storage levels and exploits the changes to guide the allocation of resources for staging and compaction of KV pairs.To mitigate I/O amplifications,we have designed Grad KV with several key techniques including(1)staging pavilion for fast buffering of SSTables and batched compaction,(2)leap-forward compaction to reduce cascading compactions,and(3)dual indexing with an in-memory cache of keys for expedited read operations.Our evaluation shows that,compared to Level DB,Grad KV delivers up to 2.13 x,1.61 x,and 1.12 x throughput for puts,gets,and scans,respectively.
Keywords/Search Tags:LSM-tree, key-value store, key distribution, write amplification, read amplification
PDF Full Text Request
Related items