Study On Performance Optimization For SSD-based RAID Arrays

Posted on: 2016-01-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Pan
Full Text: PDF
GTID: 1108330473961535
Subject: Computer software and theory
Abstract/Summary:
In the era of massive data, guaranteeing both high reliability and high performance is a major challenge for storage systems, and SSD-based RAID arrays are well suited to address it. On the one hand, a RAID array not only provides I/O parallelism but also guarantees data reliability by storing redundant information. On the other hand, with higher I/O performance, lower power consumption, and higher shock resistance, solid-state drives (SSDs) further improve the performance of RAID arrays. However, SSDs also have several constraints, such as a limited number of erase operations, time-variant reliability, and the cost of garbage collection, which may hurt both the performance and the endurance of SSD-based RAID arrays. Therefore, it is important to study both SSDs and workloads in order to optimize the performance and endurance of SSD-based RAID arrays. Specifically, we design a new coding method that improves system-level wear-leveling in the coding layer, propose a workload-aware scheme that reduces writes and average response time in the file system layer, and study data migration strategies for SSD-based RAID arrays in the application layer. Our main work and contributions are as follows:
(1) DCS: Diagonal Coding Scheme for Enhancing the Endurance of SSD-based RAID Arrays
To guarantee high reliability, SSD-based storage systems require data redundancy schemes, e.g., RAID schemes. Traditional RAID-5, RAID-6, and Reed-Solomon codes can tolerate one, two, and an arbitrary number of device failures, respectively. However, under these redundancy configurations some SSDs may age much faster than others because of the high skewness and locality of workloads.
The uneven aging rates may cause some SSDs to wear out very quickly and reduce the endurance of SSD-based RAID arrays.
To address this problem, we first propose a diagonal coding scheme (DCS), which distributes the update dependencies evenly among devices to improve system-level wear-leveling. DCS efficiently improves array endurance when requests are aligned with the stripe size, i.e., when all data symbols in the same stripe receive the same number of writes, although that number may differ from stripe to stripe. To relax this assumption, we further propose an enhanced scheme, DCS+. With a buffer design, DCS+ improves wear-leveling among devices under general access patterns by responding differently to different kinds of requests. We conduct extensive evaluations based on real-world workloads with the widely accepted simulator DiskSim with SSD Extension, and the results show that our design efficiently enhances the endurance and performance of SSD-based RAID arrays.
(2) Grouping-based Elastic Striping with Hotness Awareness for Improving SSD RAID Performance
Conventional RAID arrays usually update parities with read-modify-write or read-reconstruct-write, which may introduce many extra I/Os and thus significantly degrade SSD RAID performance. The recently proposed elastic striping scheme constructs new stripes from newly updated data chunks without updating the old parity chunks. This scheme does reduce the cost of parity updates. However, it requires RAID-level garbage collection, which may incur a very high cost.
To address this problem, we propose a hotness-aware caching scheme that buffers incoming writes and categorizes the data chunks in the buffer into multiple groups according to their hotness values. We then propose a grouping-based elastic striping scheme that writes the data chunks of different groups to SSDs separately.
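The grouping idea can be illustrated with a minimal sketch. It is not the dissertation's implementation: the hotness metric (number of buffered writes per chunk), the thresholds, and the function names `group_by_hotness`, `xor_parity`, and `build_stripes` are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def group_by_hotness(writes, thresholds=(2, 4)):
    """Split buffered chunk ids into groups, coldest group first.

    Assumption: hotness is the number of buffered writes to a chunk,
    and the thresholds are hypothetical cut-off values.
    """
    counts = Counter(writes)
    groups = [[] for _ in range(len(thresholds) + 1)]
    for chunk, n in counts.items():
        level = sum(n >= t for t in thresholds)  # how many thresholds are met
        groups[level].append(chunk)
    return groups


def xor_parity(chunks):
    """Byte-wise XOR of equally sized chunks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def build_stripes(group, data, k):
    """Pack one group's chunks into new stripes of k data chunks plus parity.

    Only full stripes are formed; leftover chunks would stay in the buffer.
    """
    stripes = []
    for i in range(0, len(group) - k + 1, k):
        ids = group[i:i + k]
        stripes.append((ids, xor_parity([data[c] for c in ids])))
    return stripes


if __name__ == "__main__":
    buffered = ["a", "b", "a", "c", "a", "b", "a"]
    for level, group in enumerate(group_by_hotness(buffered)):
        print(level, group)
```

Because each new stripe holds chunks of similar hotness, the chunks in a stripe tend to be invalidated at around the same time, which is the intuition behind keeping RAID-level garbage collection cheap.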
To validate the effectiveness of our design, we deploy the proposed schemes on a RAID-5 array composed of eight commercial SSDs. Experimental results show that, compared to elastic striping, our scheme reduces chunk writes to SSDs by 26%-65%, reduces the average response time by 17.2%-63.9%, and helps achieve system-level wear-leveling.
(3) Rethinking Data Migration in SSD-based RAID Arrays
Device addition (also called RAID scaling) is often carried out in modern RAID arrays to meet the ever-increasing demand for storage capacity and I/O performance. After new devices are mounted, data migration is triggered. Traditional data migration strategies are designed for HDD-based RAID arrays, so they need to be redesigned for SSD-based RAID arrays. When an SSD-based RAID array triggers data migration, each SSD may experience a different frequency of garbage collection and deliver different I/O capability because of the skewness and locality of workloads. Therefore, it is important to design new data migration schemes for SSD-based RAID arrays in a heterogeneous environment.
To achieve this, we propose the FastMigration scheme and the FastAccess scheme for heterogeneous SSD-based RAID arrays. The former aims to complete the migration process quickly by migrating more data from the SSDs with better performance, while the latter aims to achieve better I/O performance after migration by migrating more data from the SSDs with poorer performance. We then conduct extensive evaluations with DiskSim with SSD Extension and validate the effectiveness of the proposed migration schemes. Finally, we compare the advantages and disadvantages of the two data migration schemes and discuss the technical challenges of deploying them in real systems. Based on these findings, we give a set of recommendations and guidelines to system designers for deploying data migration schemes in SSD-based RAID arrays.
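The contrast between the two migration goals can be sketched as a quota calculation. This is an illustrative assumption, not the dissertation's algorithm: the abstract only says that FastMigration moves more data from better-performing SSDs and FastAccess moves more from poorer-performing ones. The proportional weighting, the performance scores, and the name `migration_quota` are hypothetical.

```python
def migration_quota(perf, total_chunks, scheme):
    """Decide how many chunks to migrate away from each existing SSD.

    perf: hypothetical per-SSD performance scores (higher is faster).
    scheme: "fast_migration" weights SSDs by performance,
            "fast_access" weights them by inverse performance.
    Proportional weighting is an assumption made for this sketch.
    """
    if scheme == "fast_migration":
        weights = [float(p) for p in perf]
    elif scheme == "fast_access":
        weights = [1.0 / p for p in perf]
    else:
        raise ValueError(f"unknown scheme: {scheme}")

    total_weight = sum(weights)
    quotas = [int(total_chunks * w / total_weight) for w in weights]

    # Hand chunks lost to rounding down to the highest-weight SSDs.
    remainder = total_chunks - sum(quotas)
    by_weight = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in by_weight[:remainder]:
        quotas[i] += 1
    return quotas


if __name__ == "__main__":
    scores = [4.0, 2.0, 2.0]  # the first SSD is assumed to be the fastest
    print(migration_quota(scores, 80, "fast_migration"))
    print(migration_quota(scores, 80, "fast_access"))
```

Under these assumptions, the first policy drains the fastest device hardest so that migration finishes sooner, while the second relieves the slowest devices so that they serve less data after migration.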
Keywords/Search Tags: SSD, RAID, wear-leveling, workload-awareness, data migration