Cache And Index Key Technology Research Based On LSM-tree

Posted on: 2022-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhang
Full Text: PDF
GTID: 2518306743474264
Subject: Computer technology
Abstract/Summary:
The log-structured merge-tree (LSM-tree) has high write performance. It converts random writes into sequential writes through its memory component, and then reorganizes the data into Sorted String Tables (SSTables). The disk component is a multi-layer structure. In the LSM-tree, merge operations periodically merge the underlying data, which causes performance problems such as cache invalidation, write amplification, and read amplification. To mitigate these problems, this thesis proposes the following solutions:

1) A two-phase parallel prefetching approach is proposed to reduce the cache invalidation caused by merge operations. The root cause of cache invalidation is that a merge operation reorganizes the data of the disk component, so some of the data in the block cache becomes invalid and can no longer be accessed. To address the periodic decline in cache hit ratio caused by cache invalidation, this thesis uses the two-phase prefetching approach to prefetch into the cache the data blocks that are likely to be accessed in the future. Experimental results show that the two-phase parallel prefetching approach can improve the cache hit ratio by about 2.65 times.

2) An index structure based on hash grouping is proposed to reduce the write and read amplification of the LSM-tree. The LSM-tree is an index structure optimized for write performance, and in pursuing high write performance it makes some concessions on read performance. First, because of the layered storage structure of the disk component, a query has to proceed layer by layer using binary search, which causes read amplification. Second, the multi-layer structure also causes write amplification, which over time seriously shortens the service life of SSDs. By analyzing the causes of write and read amplification, this thesis proposes a read-write tradeoff scheme based on hash grouping, which reduces write and read amplification by dividing the data into multiple groups and reducing the number of layers of the disk component. Experimental results show that the index structure based on hash grouping can improve read performance by about 11%.
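The hash-grouping idea in contribution 2) can be illustrated with a minimal sketch. The abstract does not describe the actual index design, so everything below is an assumption made for illustration: the class names `ToyLSM` and `HashGroupedStore` are hypothetical, CRC32 is an arbitrary choice of hash function, in-memory sorted lists stand in for SSTables, and no merge (compaction) step is modeled. The sketch only shows how routing each key to one group by its hash lets a lookup search that group's sorted runs and ignore all the others.

```python
import bisect
import zlib


class ToyLSM:
    """Hypothetical stand-in for one LSM-tree: a memtable plus sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.memtable_limit = memtable_limit
        self.runs = []  # newest run first; each run is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Sort the buffered writes once and emit them as one immutable run.
        # This is the step that turns random writes into a sequential write.
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest to oldest, so the latest value wins
            # (key,) sorts just before any (key, value) pair with the same key.
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


class HashGroupedStore:
    """Hypothetical hash-grouped store: each key belongs to exactly one group."""

    def __init__(self, num_groups=4, memtable_limit=4):
        self.groups = [ToyLSM(memtable_limit) for _ in range(num_groups)]

    def _group(self, key):
        # CRC32 is deterministic across runs, unlike the built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(self.groups)
        return self.groups[index]

    def put(self, key, value):
        self._group(key).put(key, value)

    def get(self, key):
        # Only the runs of one group are searched, not every run in the store.
        return self._group(key).get(key)


store = HashGroupedStore(num_groups=4, memtable_limit=4)
for i in range(100):
    store.put(f"key{i:03d}", i)
store.put("key007", "updated")
print(store.get("key007"))   # updated
print(store.get("missing"))  # None
```

In this toy model each group accumulates roughly a quarter of the runs that a single ungrouped structure would hold, so a point lookup performs fewer binary searches. How the thesis balances this against write amplification and the number of disk layers is not specified in the abstract.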
Keywords/Search Tags: Key-value storage, Cache invalidation, Cache prefetching, Hash partition, Index structure