Research On Caching And Prefetching Technologies For Hierarchical Hybrid Storage Systems

Posted on: 2014-04-18; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Y Liu; Full Text: PDF
GTID: 1228330425473275; Subject: Computer system architecture
Abstract/Summary:
With the widespread use of SSDs (solid-state disks), hybrid storage systems composed of SSDs and HDDs (hard disk drives) have become an active research area. A hybrid storage system combines the advantages of SSDs and HDDs, such as high performance, large capacity, and low cost. There are three main architectures for hybrid storage systems: (1) the SSD serves as both a read cache and a write buffer for the disk layer to accelerate I/O; (2) the disk serves as a write buffer for the SSD layer to slow SSD wear; (3) the SSD and the disk both serve as persistent storage, and system performance is optimized through data migration and remapping. The first architecture is very popular because it provides high IOPS, low cost, and transparent deployment while requiring only a small-capacity SSD for a large disk volume.

An SSD's lifetime is related to the number of program/erase (P/E) cycles it performs. When an SSD is used as a disk cache, its lifespan is shortened dramatically if the workload keeps changing its set of popular blocks. Meanwhile, a large SSD cache destroys the access locality seen by the underlying storage system, which particularly affects locality-sensitive systems such as disks and RAID systems. To address the problems caused by the SSD cache, improving caching and prefetching technologies to optimize hybrid storage systems has become a hot topic in both industry and academia.

Firstly, we propose a hierarchical hybrid storage system, RAF (Random Access First), which consists of an SSD cache and a RAID system. Using a flash-cache-oriented cost-benefit model, this architecture supports a cache insertion strategy that gives priority to randomly accessed data. RAF is a hybrid storage architecture optimized for flash caches. By leaving sequentially accessed data to the disk layer, the SSD cache keeps only the highly random data, which brings the highest benefit. As a result, the SSD's lifespan is extended and the response time is reduced.
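
The random-access-first insertion idea can be illustrated with a small admission filter. This is only a sketch under stated assumptions: it does not implement the dissertation's cost-benefit model, and the class name `RandomFirstAdmission`, the `seq_threshold` parameter, and the run-length heuristic are all hypothetical choices made for illustration.

```python
class RandomFirstAdmission:
    """Hypothetical filter: admit a request to the SSD cache only if it
    does not extend a sequential run (a stand-in for a cost-benefit model)."""

    def __init__(self, seq_threshold=2):
        # Assumed knob: number of contiguous requests that make a run "sequential".
        self.seq_threshold = seq_threshold
        self.next_lba = None   # block address that would continue the current run
        self.run_length = 0    # contiguous requests seen in the current run

    def should_cache(self, lba, nblocks):
        # A request continues the run if it starts where the previous one ended.
        if lba == self.next_lba:
            self.run_length += 1
        else:
            self.run_length = 1
        self.next_lba = lba + nblocks
        # Sequential runs are left to the disk layer; random requests go to the SSD.
        return self.run_length < self.seq_threshold
```

With the default threshold, the first request of a run is still admitted, because a run cannot be recognized as sequential until its second contiguous request arrives.
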
In addition, RAF improves garbage collection efficiency because the flash space is divided into two areas, a read area and a write area, so that the invalid pages to be reclaimed by the garbage collection module gather in the write area. Experimental results show that, compared with the FlashCache policy under the same hardware configuration, RAF improves the average response time by 17% and the wear rate by 53%.

Secondly, main memory, the SSD cache, and disks naturally form a multi-level cache architecture. To address the excessive wear of the SSD in this architecture, we propose a bypassing algorithm for the SSD cache called CHPA (Characteristics between Hierarchies byPassing cache Algorithm). CHPA is an asynchronous multi-level caching algorithm that aims to reduce the amount of data written to the SSD cache. The main idea is to predict hot blocks by counting their hit frequency in the DRAM cache and their trips between the DRAM cache and the SSD cache. When blocks are evicted from the upper DRAM cache, the hot blocks among them are inserted into the SSD cache, while the cold blocks bypass it, reducing the expensive write overhead of the SSD cache.

Thirdly, we propose an SSD-oriented sequential prefetching policy called FLAP (FLash-Aware Prefetching), which incorporates a novel structure named the relationship graph. FLAP is an aggressive prefetching policy with high accuracy. With a quantitative analysis model and the ample cache space within the SSD, FLAP performs accurate and aggressive prefetching on cache misses in order to reduce the prefetching cost.
Additionally, by separating a prefetch area from the SSD caching space and using a time-aware allocation scheme, FLAP optimizes garbage collection efficiency in the prefetch area, thus alleviating the wear caused by prefetching.

Finally, since the upper cache tiers composed of DRAM and the SSD have filtered out most of the temporal locality of I/O accesses, prefetching becomes more important for the bottom storage layer composed of disks (RAID systems). Hence, we propose a strip-oriented asynchronous prefetching scheme called SoAP that adapts to the internal characteristics of the RAID system. SoAP is designed for parallel storage systems. It breaks a prefetching request into multiple sub-requests, each of which covers a complete strip, to solve the problem of sequentiality loss. In addition, with multi-queue and asynchronous scheduling schemes, SoAP performs prefetching using idle disk bandwidth, thus reducing the prefetching cost.
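
The strip-oriented splitting step of SoAP can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the function name `split_into_strips`, its parameters, the outward rounding to strip boundaries, and the round-robin strip-to-disk mapping (RAID-0 style, no parity rotation) are all assumptions.

```python
def split_into_strips(start, length, strip_blocks, num_disks):
    """Split a prefetch request covering blocks [start, start + length)
    into sub-requests that each cover one complete strip.

    Returns a list of (disk, strip_start_block, strip_blocks) tuples.
    """
    if length <= 0:
        return []
    # Round outward so every sub-request covers a whole strip.
    first_strip = start // strip_blocks
    last_strip = (start + length - 1) // strip_blocks
    subs = []
    for s in range(first_strip, last_strip + 1):
        # Assumed layout: strips are placed on disks in round-robin order.
        disk = s % num_disks
        subs.append((disk, s * strip_blocks, strip_blocks))
    return subs
```

For example, with 64-block strips on 4 disks, `split_into_strips(100, 100, 64, 4)` yields three sub-requests for strips 1 to 3, one per disk, which a scheduler could queue independently per disk.
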
Keywords/Search Tags: Hybrid Storage System, Cache, Prefetch, Sequential Stream, Garbage Collection