
Research On Prefetching And Caching Technologies Of Parallel Disks System

Posted on: 2013-04-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X D Shi
Full Text: PDF
GTID: 1228330392455428
Subject: Computer system architecture
Abstract/Summary:
With the development of network and computer technologies and the emergence of large-scale data-intensive applications, parallel storage systems based on RAID have become increasingly important and widely used. However, performance improvements in disks lag far behind the progress of electronic devices such as processors, so the storage system remains the performance bottleneck of the whole system. Prefetching and caching are two key technologies that can effectively improve storage system performance. However, conventional prefetching and caching schemes do not work well in parallel storage systems because they are typically designed for a single storage device. Parallel storage systems differ substantially from single devices, so exploiting and integrating their characteristics into prefetching and caching for further performance improvement is a meaningful topic.

Sequentiality is one of the most important disk access patterns. For disks (including the disks in a RAID), accesses to sequentially placed blocks can be an order of magnitude faster than accesses to randomly placed blocks. The importance of sequentiality has attracted many studies that focus on exploiting or improving sequentiality in accesses. However, these techniques face the challenge of obtaining the length of sequential streams. To solve this problem, we propose a weighted-graph-based model that exploits and predicts this length information.
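The abstract does not give the model's internal structure, but the idea of a weighted graph that learns and predicts sequential-stream lengths can be illustrated with a minimal sketch. Here the class name `LengthGraph` and its methods are purely illustrative assumptions, not the dissertation's actual design: nodes are observed stream lengths, and a directed edge (a → b) is weighted by how often a stream of length a was followed by one of length b.

```python
from collections import defaultdict

class LengthGraph:
    """Hedged sketch of a weighted-graph length predictor.

    Nodes are observed sequential-stream lengths; the weight of the
    directed edge (prev_len -> next_len) counts how often that
    transition was observed in the workload.
    """

    def __init__(self):
        # edges[prev_len][next_len] -> observed transition count
        self.edges = defaultdict(lambda: defaultdict(int))

    def observe(self, prev_len, next_len):
        # Strengthen the edge between two consecutive stream lengths.
        self.edges[prev_len][next_len] += 1

    def predict(self, current_len, default=1):
        # Predict the most heavily weighted successor length; fall back
        # to a conservative default when the length was never observed.
        successors = self.edges.get(current_len)
        if not successors:
            return default
        return max(successors, key=successors.get)
```

Because the graph stores per-workload transition counts, it can both record irregular ("anomalous") length distributions and keep adapting as new streams are observed, which matches the adaptivity claim in the abstract.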
This model not only captures the anomalous details of the length distribution under a specific workload, but also adapts itself to work well under as many workloads as possible.

Parallel disk systems, characterized by their architecture, typically operate at the lower level of a storage center, where their caches are configured as second-level buffer caches. Temporal locality in a second-level buffer cache is weak because its accesses are the misses passed down from the first-level cache. To make matters worse, modern storage systems have cache sizes similar to those of their hosts, so the reuse distance of pages in the second-level buffer cache differs little from their lifetime (the time during which a page is retained in the buffer cache). As a result, whenever the lifetime is prematurely shortened (for example, by prefetching), the corresponding pages are very likely to lose their chance of being re-referenced. This problem remains unsolved in traditional prefetching schemes. To address it, we propose ASEP, an adaptive sequential prefetching scheme for second-level buffer caches. When determining the prefetching depth, ASEP considers not only the exploitation of sequentiality but also the influence on the buffer cache, including the effect on both the replaced data and the data that loses its re-reference opportunity. As a result, the hit ratio of the second-level buffer cache is significantly improved.

For a prefetching request in a highly concurrent disk array, disk independence is more important than parallelism, since the former can significantly reduce I/O cost. We therefore propose locality-aware strip prefetching (LSP). Strip prefetching fetches, at low read cost, the whole strip following a user access to part of that strip, saving disk-head positioning time. More importantly, LSP integrates the user access pattern into strip prefetching.
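The abstract does not state ASEP's actual depth formula, but the trade-off it describes, sequentiality gain versus the cost imposed on resident cache pages, can be sketched as a simple net-benefit scoring function. The function name, inputs, and scoring rule below are assumptions for illustration only: each additional prefetched page contributes its expected hit probability and subtracts an estimated cost of displacing (or prematurely aging) one cached page.

```python
def choose_prefetch_depth(hit_prob_by_depth, eviction_cost_per_page, max_depth=64):
    """Hedged sketch of an ASEP-style depth decision (the scoring rule is
    an assumption, not the dissertation's formula).

    hit_prob_by_depth[d-1]: estimated probability that the d-th
        prefetched page is referenced before it is evicted.
    eviction_cost_per_page: estimated hits lost by displacing one
        resident page to make room for a prefetched page.
    """
    best_depth, best_score = 0, 0.0
    score = 0.0
    for d in range(1, min(max_depth, len(hit_prob_by_depth)) + 1):
        # Net benefit of prefetching one more page: expected hit gain
        # minus the cache-side cost of shortening a resident page's life.
        score += hit_prob_by_depth[d - 1] - eviction_cost_per_page
        if score > best_score:
            best_depth, best_score = d, score
    return best_depth
```

With hit probabilities that decay with depth, the cumulative score peaks at the depth where the marginal prefetch stops paying for the page it displaces; a depth of 0 (no prefetch) falls out naturally when the cache-side cost dominates.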
Based on the observed accesses, LSP detects the prefetching data areas and prefetches only the strips located in those areas. Owing to the spatial locality of these areas, the algorithm efficiently utilizes disk bandwidth and improves the buffer cache hit ratio.

The most important goal in traditional cache management design is to minimize the number of cache misses. Unfortunately, this can be a misleading metric: the cost of requesting ten parallel blocks can be lower than that of requesting three serial blocks, and the same holds for sequential versus random blocks. To address this, we propose PCAR, a parallelism-based cache management scheme for parallel disk systems that exploits both inter-disk parallelism and intra-disk sequentiality. To achieve these goals, PCAR selects victim data evenly across the disks and first evicts the strips with more cached blocks. Experimental results show that the access stream shaped by the PCAR buffer cache is reorganized and optimized to contain more parallel and sequential requests.
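The two PCAR victim-selection rules stated in the abstract, spread evictions evenly across disks and prefer strips with more cached blocks, can be sketched as follows. The data layout here (each cached block as a `(disk, strip, block)` tuple) and the function name are illustrative assumptions, not the dissertation's implementation.

```python
from collections import defaultdict

def select_victims(cached_blocks, n_victims):
    """Hedged sketch of PCAR-style victim selection.

    Victims are drawn round-robin across disks (even spread), and
    within each disk the strip holding the most cached blocks is
    evicted first, so the resulting misses regroup into whole-strip,
    multi-disk requests.
    """
    # Group cached blocks by disk, then by strip.
    by_disk = defaultdict(lambda: defaultdict(list))
    for disk, strip, block in cached_blocks:
        by_disk[disk][strip].append((disk, strip, block))

    victims = []
    disks = sorted(by_disk)
    i = 0
    while len(victims) < n_victims and any(by_disk.values()):
        disk = disks[i % len(disks)]  # round-robin: spread evictions evenly
        strips = by_disk[disk]
        if strips:
            # Prefer the strip with the most cached blocks on this disk.
            strip = max(strips, key=lambda s: len(strips[s]))
            victims.append(strips[strip].pop())
            if not strips[strip]:
                del strips[strip]
        i += 1
    return victims
```

Evicting the densest strips first means that when those blocks miss later, they return as large sequential strip reads spread over many disks, which is exactly the "more parallel and sequential requests" shaping the abstract describes.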
Keywords: Parallel Disks System, Sequential Prefetching, Sequential Stream Prediction, Second-Level Buffer Cache, Strip Prefetching, Parallel Prefetching