
Research On Server-end Cache Optimization For Cloud Block Storage Systems

Posted on: 2022-03-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 1488306572476294
Subject: Computer system architecture
Abstract/Summary:
As an emerging general block-level storage service, cloud block storage (CBS) offers high performance, high scalability, high reliability, and generality, and has received broad attention from industry. CBS has been widely deployed by major cloud providers to provide basic data storage in various fields. Improving the performance of the CBS cache is an effective means of enhancing its I/O performance. However, optimizing the cloud block cache faces several challenges. First, with thousands of applications deployed on CBS, the complexity and dynamics of I/O workloads make it difficult to guarantee the effectiveness of current algorithms. Second, large-scale CBS continuously generates a great amount of log information, imposing unacceptable transmission and processing overhead on existing analysis methods. Finally, the architectural characteristics of CBS and the popularity of emerging cache media impose new requirements on cache optimization. Designing appropriate cache policies for the complex, changing cloud I/O workloads and the CBS architecture is the focus of current research.

To address these issues, this thesis optimizes the CBS cache along three dimensions: the cache allocation policy, the cache replacement policy, and the cache write policy. The main contributions are summarized as follows.

To address the complex processing, low accuracy, and high additional overhead of currently widely used cache allocation techniques, we propose an Online-model-based Scheme for Cache Allocation (OSCA). OSCA can search for a near-optimal configuration at very low complexity to improve overall cache efficiency. First, we propose a re-access-ratio-based cache model, RAR-CM, which calculates the data reuse distance at O(1) complexity; from the reuse distance distribution, the miss ratio curve of each storage node in CBS can be obtained. Second, the total number of cache hits over all accesses is defined as the optimization objective. Finally, a near-optimal solution is searched for using a dynamic programming approach, and cache space is reallocated according to this solution. Experimental results with real workloads show that our cache model achieves a lower mean absolute error than state-of-the-art techniques. Owing to the improved overall hit ratio of the shared cache, OSCA reduces I/O traffic to the back-end storage servers by 13.2% relative to an equal-allocation-to-all-instances policy.

To alleviate the cache inefficiency caused by the large-reuse-distance phenomenon at the storage layer of cloud block storage systems, we propose a Lazy Eviction cache Algorithm (LEA). When a cache miss occurs and the cache is full, LEA compares the worth of the currently accessed block and the victim cache block based on its lazy conditions to decide whether to execute a replacement, significantly increasing the residence time of truly valuable cached blocks and effectively improving cache performance. The lazy conditions are controlled by two adjustable lazy parameters (K and PARA), through which the residence time of cache blocks can be regulated to adapt to changing workloads. Simulation results show that LEA not only outperforms state-of-the-art cache algorithms in hit ratio but also significantly reduces the number of writes to the SSD cache.

To alleviate the useless writes that current cache write policies issue to the SSD cache, we propose a Machine Learning based cache Write Policy (ML-WP). Based on statistics of I/O traces from Tencent CBS and an analysis of the CBS architecture, ML-WP reduces write traffic to the SSD cache by avoiding writing write-only data to the cache. To accurately identify write-only data, a machine learning approach is introduced to classify data blocks into two categories: write-only data and normal data. To improve accuracy, the classification is based on the request-level features of data blocks. Based on the classification results, write-only data is not cached but persisted directly to the back-end storage. Experimental results show that ML-WP reduces write traffic to the SSD cache by 41.52%, while improving the hit ratio by 2.61% and reducing the average read latency by 37.52% compared to the widely deployed write-back strategy.
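The OSCA contribution combines per-node miss ratio curves with a dynamic program that maximizes total cache hits under a shared-capacity budget. As a minimal sketch of that search step (the function name, the unit-based capacity model, and the input shapes are illustrative assumptions, not the thesis's implementation; the MRCs here would come from RAR-CM):

```python
def allocate_cache(mrcs, accesses, total_units):
    """Dynamic program over (node, cache units) maximizing total hits.

    mrcs[i][u]  -- miss ratio of node i when given u cache units
    accesses[i] -- total access count of node i
    Returns a per-node allocation summing to at most total_units.
    """
    n = len(mrcs)
    best = [0.0] * (total_units + 1)          # best[u]: max hits with u units
    choice = [[0] * (total_units + 1) for _ in range(n)]
    for i in range(n):
        new = [0.0] * (total_units + 1)
        for u in range(total_units + 1):
            for give in range(u + 1):          # units granted to node i
                hits = best[u - give] + accesses[i] * (1.0 - mrcs[i][give])
                if hits > new[u]:
                    new[u] = hits
                    choice[i][u] = give
        best = new
    # Backtrack to recover the per-node allocation.
    alloc, u = [0] * n, total_units
    for i in range(n - 1, -1, -1):
        alloc[i] = choice[i][u]
        u -= alloc[i]
    return alloc
```

With two nodes whose MRCs flatten at different rates, the search gives more units to the node whose extra capacity converts more accesses into hits, which is exactly the reallocation behavior the abstract describes.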
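The abstract states that LEA admits a missed block only when "lazy conditions" on K and PARA hold, but does not define those conditions. The sketch below is therefore hypothetical: it assumes one plausible reading in which a block must miss K times (tracked in a ghost table) and the LRU victim must have fewer than PARA hits before a replacement is executed. The thesis's actual conditions may differ.

```python
from collections import OrderedDict, defaultdict

class LEACache:
    """Hypothetical lazy-eviction sketch over an LRU base cache."""

    def __init__(self, capacity, K=2, PARA=4):
        self.cap, self.K, self.PARA = capacity, K, PARA
        self.cache = OrderedDict()       # block -> access count while cached
        self.ghost = defaultdict(int)    # block -> miss count while uncached

    def access(self, block):
        if block in self.cache:
            self.cache[block] += 1
            self.cache.move_to_end(block)       # keep LRU order
            return True                          # hit
        self.ghost[block] += 1
        if len(self.cache) < self.cap:
            self.cache[block] = 1
            self.ghost.pop(block, None)
        else:
            victim, vcount = next(iter(self.cache.items()))  # LRU victim
            # Assumed lazy condition: replace only if the candidate has
            # missed K times AND the victim shows weak reuse (< PARA hits).
            if self.ghost[block] >= self.K and vcount < self.PARA:
                del self.cache[victim]
                self.cache[block] = 1
                self.ghost.pop(block, None)
        return False                             # miss
```

The point of the structure, matching the abstract's claim, is that a single-touch block cannot immediately displace a resident block, which both lengthens the residence time of valuable blocks and suppresses SSD cache writes on one-off misses.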
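ML-WP's routing decision, once the classifier has labeled a block, is simple: write-only data bypasses the SSD cache and goes straight to back-end storage, while normal data follows the write-back path. A minimal sketch of that routing, with `predict_write_only` standing in for the trained request-level classifier (its name and the dict-backed cache/backend are illustrative assumptions):

```python
def handle_write(block_id, data, cache, backend, predict_write_only):
    """Route a write based on the classifier's write-only prediction.

    cache   -- dict modeling the SSD cache
    backend -- dict modeling back-end storage
    """
    if predict_write_only(block_id):
        backend[block_id] = data       # persist directly, no SSD cache write
        cache.pop(block_id, None)      # drop any stale cached copy
    else:
        cache[block_id] = data         # write-back: cache absorbs the write,
                                       # flushed to the backend later
```

Every correctly predicted write-only block saves one SSD cache write and leaves its cache slot free for read-reused data, which is the mechanism behind the reported reductions in write traffic and read latency.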
Keywords/Search Tags:Cloud Storage, Block Storage, Cache Policy, Cache Allocation, Cache Model