Font Size: a A A

Towards High Performance Key-Value Store

Posted on:2022-02-28Degree:MasterType:Thesis
Country:ChinaCandidate:L J YouFull Text:PDF
GTID:2518306323978649Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
With the development of the Internet in recent years,the field of data storage has shown new trends such as rapid growth of data scale and increasing percentage of un-structured data,which makes the traditional relational database unable to meet the grow-ing data storage needs.The key-value store is widely used in different data-intensive applications due to its simple and flexible interface and good performance,and lots of research work has been carried out at home and abroad.In this paper,we focus on the memory usage of key-value storage system from two aspects:on the one hand,LSM-Tree,as a widely used data structure in key-value stores,is organized in a hierarchical structure on disk,which causes users to look up data in multiple levels when query-ing,and reduces memory cache efficiency,both lead to worse query performance.On the other hand,with the rise of cloud services,more and more users choose to deploy key-value stores in virtualized environments,in which case we can perform memory deduplication in the host machine to ensure the memory usage of different virtual ma-chines with limited physical memory,but this process itself consumes a large number of system resources.LSM-Tree Query Performance Optimization:Existing research on workloads shows that workloads with hot range access characteristics are common to key-value stores,so to address the first issue,this paper proposes.an inter-level compact design based on hotness awareness(HaLC-KV),whose basic idea is to dynamically reduce the number of levels within hot ranges at runtime,which improves the query performance of the system without increasing the memory usage.By recording the access frequency of each SSTable file in the last level of the LSM-Tree,HaLC-KV can identify the hot ranges under the current workload,and then dynamically reduce the number of levels of these hot ranges,which reduces the index lookup overhead during a query,and achieves better memory cache efficiency.HaLC-KV also allocates a separate memory buffer for each hot range to buffer the newly written data,and merges the buffer directly with the last level of SSTable when the buffer size reaches the threshold,to avoid increasing the number of levels due to direct flushing newly written data to level 0.We have implemented HaLC-KV on top of LevelDB and conducted several experiments.The experimental results show that HaLC-KV can reduce the latency of point query and range query by 16.4%and 27.1%,respectively,compared with LevelDB.Lightweight Memory Deduplication:To address the second issue,in order to re-duce the system resource consumption of host memory deduplication when deploying a key-value storage system in a virtualized environment,we analyze the workflow of the KSM and find that its high system resource overhead is mainly caused by its use of red-black trees as indexes for in-memory pages and directly using the page contents as keys in those red-black trees.Therefore this paper proposes a lightweight memory dedupli-cation design based on tries(AMT-KSM),whose basic idea is to divide each memory page into segments,compute a hash value of each segment separately,and use the con-catenated strings of the hash values of segments as keys to index in the tries.In this way,we can greatly reduce the time needed to search for duplicate pages and the number of page comparisons.We implemented our design in the Linux Kernel and conducted ex-periments to evaluate our design.The experimental results show that AMT-KSM can significantly reduce CPU utilization and memory bandwidth usage during deduplication by 44.9%and 31.6%,respectively,compared to the conventional KSM.
Keywords/Search Tags:Key-Value Store, LSM-Tree, Memory Management
PDF Full Text Request
Related items