
Research On Data Layout Technologies For Disk Arrays

Posted on: 2011-07-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Mao
Full Text: PDF
GTID: 1118360305492370
Subject: Computer system architecture
Abstract/Summary:
RAID (Redundant Array of Independent Disks) is an important technique for building high-performance, highly reliable, large-capacity storage systems. With the rapid development of disk technology and of new storage media such as the flash-based Solid State Disk (SSD), traditional disk array technology faces many challenges. Disk capacity increases at a rate of 60% per year, while disk performance parameters such as seek time and rotational delay improve at a rate of only about 10%. Recently, researchers have found that in large-scale data centers the disk failure rate is far higher than disk manufacturers guarantee. This uneven development of disk capacity, performance and reliability indicates that the traditional disk array data layout cannot meet performance and reliability requirements simultaneously. Moreover, end users require the storage subsystem to be much more energy efficient, which traditional disk arrays do not take into account. Meanwhile, as SSDs continue to develop, SSD-based arrays also face challenges of their own. Therefore, optimizing the data layout of traditional disk arrays to improve the performance, reliability and energy efficiency of the storage subsystem is a significant and urgent task.

Because of the dual-copy nature of mirrored redundant disk arrays (RAID1/10), a write request completes only after both mirroring disks have written the data, so write performance is rather poor. Moreover, the dual redundancy also makes mirrored redundant disk arrays consume more energy in the normal state. Based on the data layout of the traditional mirrored redundant disk array, two new data layout methods are proposed: RAID10L for improving performance and GRAID for reducing power consumption. They extend the RAID10 data layout by incorporating a dedicated log disk.
For every write request, RAID10L keeps two copies of the written data: one in its normal location on a data disk chosen by a write-balancing scheme, and the other appended sequentially to the log disk. The update to the other data disk in the same mirror set is deferred to the next idle period. Performance evaluations show that RAID10L outperforms RAID0 and RAID10 by 27.3%-47.1% in average response time. GRAID stores all data written since the last mirror-disk update on a log disk and periodically propagates it to the mirroring disks. It can therefore keep all the mirroring disks (that is, half of the total disks) spun down in a low-power mode most of the time, saving energy without sacrificing reliability. The evaluations show that GRAID is more energy efficient than RAID10, by up to 32.1% and by 25.4% on average. Moreover, a new disk array architecture, MP-RAID, which combines mirroring and parity techniques to further improve the reliability of disk arrays, is proposed. MP-RAID can dynamically switch its data layout between mirroring redundancy and parity redundancy to meet energy efficiency and performance requirements.

Due to the characteristics of flash media, flash-based SSDs have low write performance and limited erase counts, so applying RAID algorithms to SSDs directly is challenging. To fully exploit the performance advantages of both SSDs and HDDs, a hybrid parity-based disk array (HPDA) is proposed. In HPDA, the SSDs (data disks) and part of one HDD (parity disk) compose a RAID4 disk array. Meanwhile, a second HDD and the free space of the parity disk are mirrored to form a RAID1-style write buffer that temporarily absorbs small write requests and acts as a surrogate set during recovery when a disk fails. The buffered write data is reclaimed back to the data disks during lightly loaded or idle periods of the system.
Thus, the performance and reliability of the SSD-based disk array are improved significantly.

These studies provide deep insight into data layouts for improving the performance, reliability and energy efficiency of disk arrays, and offer some innovative methods. Moreover, they explore applying RAID techniques to SSD-based disk arrays, laying a better foundation for next-generation large-scale storage systems.
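The RAID10L write path summarized in the abstract (one in-place copy on a balanced data disk, one sequential copy on the log disk, mirror update deferred to idle time) can be illustrated with a small in-memory sketch. This is not the dissertation's implementation: the class and method names (`Raid10LMirrorSet`, `write`, `read`, `on_idle`) are hypothetical, disks are modeled as Python dictionaries, and the write-balancing policy (pick the disk with fewer foreground writes) is an assumption, since the abstract does not specify it.

```python
class Raid10LMirrorSet:
    """Toy model of one RAID10L-style mirror set with a log disk (hypothetical)."""

    def __init__(self):
        self.disks = [{}, {}]    # two mirrored data disks: block -> data
        self.log = []            # dedicated log disk, append-only
        self.pending = [{}, {}]  # per-disk updates deferred until an idle period
        self.writes = [0, 0]     # foreground writes served by each disk

    def write(self, block, data):
        # Assumed balancing policy: choose the disk with fewer foreground writes.
        target = 0 if self.writes[0] <= self.writes[1] else 1
        other = 1 - target
        self.disks[target][block] = data       # copy 1: in place on the chosen disk
        self.log.append((block, data))         # copy 2: sequential append to the log
        self.pending[other][block] = data      # mirror update is deferred
        self.pending[target].pop(block, None)  # chosen disk now holds the latest data
        self.writes[target] += 1
        return target

    def read(self, block):
        # Serve the read from a disk that is not stale for this block.
        for i in (0, 1):
            if block not in self.pending[i] and block in self.disks[i]:
                return self.disks[i][block]
        return None

    def on_idle(self):
        # Bring both mirrors up to date, after which the log can be reclaimed.
        for i in (0, 1):
            self.disks[i].update(self.pending[i])
            self.pending[i].clear()
        self.log.clear()


array = Raid10LMirrorSet()
array.write(1, "a")
array.write(2, "b")
array.write(1, "c")
latest = array.read(1)   # latest version of block 1
array.on_idle()          # both mirrors now hold identical contents
```

In this sketch each foreground write touches only one data disk plus the sequential log, which is the source of the response-time benefit the abstract reports. Two copies of the newest data exist at all times, so redundancy is preserved while the second mirror is stale.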
Keywords/Search Tags:Redundant Array of Independent Disks, Solid State Disk, Data Layout, Power Consumption, Performance Evaluation