| Key-value storage systems based on the log-structured merge-tree (LSM-tree) are widely used in data-intensive applications because of their high write performance. However, because the LSM-tree is unaware of hotspot characteristics, it suffers from severe read/write amplification under skewed workloads dominated by hot data. To address this problem, most previous studies adopted read/write and compaction mechanisms based on a qualitative heat model. Because these models consider only a single heat factor and do not distinguish degrees of heat among data items, they cannot identify hot data accurately. In addition, these efforts do not address the long-tail latency of reading non-hot data. The main research contents and contributions of this dissertation are therefore as follows. (1) To address the inaccurate identification of hot data in previous work, this dissertation designs a multidimensional heat statistical model. The model quantitatively computes access information and data heat from multiple dimensions, such as access frequency and last access time, so that hot data can be identified accurately. In addition, this dissertation proposes a fine-grained hotspot caching method that improves the read and write performance of hot data by caching it, and combines this method with the statistical model to identify redundant copies of hot data and to precache data blocks during compaction, thereby improving compaction performance. Experimental results show that, compared with other methods, the proposed method improves throughput by 14.1% to 55.6%, reduces average latency by 12.4% to 36.5%, and reduces 95th-percentile read latency by 11.7% to 42.1%. (2) To address the long-tail latency of reading non-hot data, this dissertation proposes a multi-level asynchronous parallel access technique based on a sliding window. The technique replaces the synchronous, serial read path of the LSM-tree with an asynchronous, parallel one, and uses the sliding window to bound the degree of parallelism so that contention for I/O resources does not cause starvation. In addition, by modeling the asynchronous parallel access process, this dissertation formally proves the effectiveness of the technique and derives a quantitative method for calculating the sliding window size. Experimental results show that the proposed method effectively mitigates the long-tail latency of reading non-hot data, reducing 99th-percentile read latency by 37.8% to 45.1% compared with other methods. |
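
The two ideas summarized above can be illustrated with a small sketch. The abstract gives neither the heat formula nor the access algorithm, so everything here is an assumption made for illustration: the names `heat_score` and `windowed_get`, the half-life decay that combines access frequency with last access time, the use of plain dictionaries as stand-ins for LSM-tree levels, and the thread pool that plays the role of asynchronous I/O. Deletion markers (tombstones) are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def heat_score(access_count, last_access, now, half_life=60.0):
    """Hypothetical heat value: access frequency damped by time since last access.

    The score halves for every `half_life` seconds the key has gone unread.
    """
    age = max(0.0, now - last_access)
    return access_count * 0.5 ** (age / half_life)


def windowed_get(levels, key, window=2):
    """Look up `key` across `levels` (newest first) with bounded parallelism.

    At most `window` level reads are in flight at once. Results are consumed
    in level order, so a hit in a newer level always wins over an older one.
    """
    with ThreadPoolExecutor(max_workers=window) as pool:
        pending = []      # in-flight reads, newest level first
        next_level = 0    # index of the next level to issue a read for
        while True:
            # Slide the window forward: top up to `window` outstanding reads.
            while len(pending) < window and next_level < len(levels):
                pending.append(pool.submit(levels[next_level].get, key))
                next_level += 1
            if not pending:
                return None  # every level was searched without a hit
            value = pending.pop(0).result()  # wait for the newest outstanding level
            if value is not None:
                for future in pending:
                    future.cancel()  # best effort; started reads still finish
                return value


# Toy levels: level 0 is the newest and shadows older versions of "a".
levels = [{"a": 1}, {"a": 0, "b": 2}, {}, {"c": 3}]
found = [windowed_get(levels, k) for k in ("a", "b", "c", "z")]
```

In this sketch a larger `window` hides more per-level read latency but issues more speculative reads to older levels. That is the trade-off a derived window size would have to balance.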