
Design And Optimization Of LSM-tree Storage Cache For Database Load

Posted on: 2022-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Zhu
GTID: 2518306752454384
Subject: Master of Engineering
Abstract/Summary:
With the growth of networked applications, write-intensive workloads have produced ever larger volumes of data, which strains the compute and storage capabilities of database systems. In response, a number of distributed databases that separate compute from storage have emerged. Most of them are built on the LSM-Tree storage architecture, which uses append-only writes to achieve good write performance, and their key-value storage model gives the database better scalability. However, the multi-level structure of the LSM-Tree lengthens the query path, so a single query can cause multiple disk I/Os. Caching is one of the main ways to improve read performance in the LSM-Tree storage architecture, but there has been little work on the architecture of the cache itself. The memtable and the cache are both placed in DRAM, and to maintain good write performance the memtable often occupies a large share of that space.

This thesis designs a two-level cache spanning DRAM and persistent memory (PM). The lower PM level expands the cache capacity, and the persistence of PM is used to warm-start the database cache. The thesis also proposes a hybrid-strategy caching algorithm for database workloads, so that the caching value of each cached page is weighed more comprehensively at eviction time. The contributions are summarized as follows:

1. Analysis of caching problems under database workloads. Extensive experiments were run on the two query workloads in the database (record-table queries and index-table queries) under different cache implementations. They show that the two kinds of tables held in the cache differ greatly in data density, that the cache hit rates of the two query types differ greatly, that the cache implementation has a substantial effect on database throughput, and that the cache page granularity has a substantial effect on the cache hit rate.

2. High-performance hybrid cache structure for DRAM/PM. A high-performance cache structure spanning DRAM and PM is proposed. Low-cost PM is used to expand the cache, and the implementation uses lock-free algorithms. The resulting cache offers more storage space and highly concurrent reads and writes, and it uses the persistence of PM to warm-start the database cache.

3. Cache eviction strategy. Based on the characteristics of the two query workloads and the difference in data density between the two kinds of tables, a hybrid-strategy eviction algorithm is designed. When a page must be evicted, it considers access time, access frequency, and data density together.

The two-level cache is implemented in TiKV and compared with TiKV's native linked-list ("chain") cache in terms of hit rate and throughput through extensive experiments. The results show that the two-level cache provides a higher hit rate and higher throughput.
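The two ideas summarized above, a DRAM level backed by a larger PM level and an eviction score that combines access time, access frequency, and data density, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract gives no formula, so the weighted-sum score, the weights, the `density` field, and the names `Page` and `TwoLevelCache` are all assumptions made for illustration. Plain dictionaries stand in for DRAM and PM, and the sketch is single-threaded, whereas the thesis describes a lock-free implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    value: bytes
    density: float        # assumed: fraction of the page holding useful data, 0..1
    last_access: int = 0  # logical clock value of the most recent access
    hits: int = 0         # number of cache hits so far


class TwoLevelCache:
    """Toy two-level cache: a small 'DRAM' dict in front of a larger 'PM' dict."""

    def __init__(self, dram_capacity, pm_capacity,
                 w_recency=1.0, w_freq=1.0, w_density=1.0):
        assert dram_capacity >= 1
        self.dram, self.pm = {}, {}
        self.dram_capacity = dram_capacity
        self.pm_capacity = pm_capacity
        self.weights = (w_recency, w_freq, w_density)  # hypothetical weights
        self.clock = 0

    def _score(self, page):
        # Assumed hybrid score: a higher value means the page is more worth keeping.
        w_r, w_f, w_d = self.weights
        age = self.clock - page.last_access
        return w_r / (1 + age) + w_f * math.log1p(page.hits) + w_d * page.density

    def _evict_lowest(self, tier):
        key = min(tier, key=lambda k: self._score(tier[k]))
        return key, tier.pop(key)

    def _demote(self, key, page):
        # Pages pushed out of DRAM move to PM; PM drops its lowest-scoring page if full.
        if self.pm_capacity <= 0:
            return
        if len(self.pm) >= self.pm_capacity:
            self._evict_lowest(self.pm)
        self.pm[key] = page

    def _insert_dram(self, key, page):
        if key not in self.dram and len(self.dram) >= self.dram_capacity:
            self._demote(*self._evict_lowest(self.dram))
        self.dram[key] = page

    def put(self, key, value, density):
        self.clock += 1
        self.pm.pop(key, None)  # avoid keeping a stale copy in the lower level
        self._insert_dram(key, Page(value, density, last_access=self.clock))

    def get(self, key):
        self.clock += 1
        page = self.dram.get(key)
        if page is None:
            page = self.pm.pop(key, None)  # a PM hit promotes the page to DRAM
            if page is None:
                return None
        page.last_access = self.clock
        page.hits += 1
        self._insert_dram(key, page)
        return page.value
```

In this sketch a sparse, rarely touched page is demoted to the PM level before a dense, frequently read one, and a later hit in PM promotes the page back to DRAM. Persisting the PM level across restarts, which is what enables the warm start described above, is outside the scope of the sketch.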
Keywords/Search Tags:LSM-Tree storage architecture, non-volatile memory, database load, high-performance cache