Font Size: a A A

Organization Of Data And Cache In Multi-node Fault-tolerant Storage Systems

Posted on:2011-09-06Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y L WangFull Text:PDF
GTID:1488303317976179Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
With the explosive growth of information resources and the applications of cloud computing model, storage capacity, data availability and I/O performance have been a great challenge to storage systems. Academia and industry have been pursuing the goal of constructing the storage systems with large capacity, high performance and high reliability. Modern storage systems are often composed of hundreds of storage nodes. The probability of multiple failures increases greatly because of several storage nodes fail simultaneously. This will lead to disastrous results because of the occurrence of data loss. The key problems in mass storage systems are how to design the scheme of data fault-tolerance and how to improve I/O performance in the multi-node fault tolerant storage systems.In this thesis, the goal is to improve the reliability and I/O performance of mass storage systems. The data fault tolerant problem and the I/O performance problem in mass storage systems are studied innovatively. The main contents and innovative results are as follows:1. To address the data fault-tolerant problem of two-node fault-tolerant storage systems, an efficient data fault-tolerant scheme called HRC is presented in this thesis. HRC is a new class MDS array code with minimum column distance 3. It satisfies the RAID-6 specification and has the optimal encoding and decoding complexity. We define HRC in theory, and prove the correctness that it corrects two storage node failures and its MDS property. A quick decoding algorithm is presented to correct two column failures. HRC has simple encoding algorithm and decoding algorithm, it can be realized easily in hardware or software. Compared with the existing RAID-6 schemes such as EVENODD and RS, HRC has lower computing complexity and can more easily be implemented in software or hardware.2. To address the data fault-tolerant problem of triple-node fault-tolerant storage systems, two efficient data fault-tolerant schemes respectively called FSC and TSC are presented in this thesis by extending HRC. FSC and TSC increase one-column parity with the slope of -2 and 1 separately. TSC has faster decoding speed than TSC by optimizing the third parity column. We prove the correctness that FSC and TSC correct three column failures and present the decoding algorithms including all failure modes. FSC and TSC are MDS array codes with minimum column distance 4, and have the optimal encoding complexity and lower decoding complexity. Compared with G-EVENODD (r=3) and STAR,FSC and TSC have faster encoding and decoding speed.3. To address the data fault-tolerant problem of multi-node (?4) fault-tolerant storage systems, an efficient data fault-tolerant scheme called VRC is presented. VRC is a class of vertical array codes and is defined in graph theory. VRC has several significant advantages such as high fault tolerance, the optimal update cost, load balance, less parameter limits, high adaptability and so on. We describe the encoding idea and construction methods in detail, present a decoding algorithm based on solving linear equations and have proved that VRC(n,p,q) can correct q column failures. Compared with existing comparable codes such as RS and LDPC, VRC has simpler encoding and decoding rules and lower computing complexity.4. To address the cache structure design problem of multi-node fault tolerance storage systems, a new cache structure called M-Cache with diversified storage media is presented based on DRAM and SSD, and an efficient cache organization and management method is also presented in this thesis. In the cache structure,the data is divided into the host read data and the host write data according to the host access model. The DRAM memory stores both the host read data and the host writing data. The SSD memory stores only the host writing data. The host writing data is written to the SSD memory at the same time that the data is written to the DRAM memory. The cache structure makes full use of the fast random access characteristics of the DRAM memory, and the long-term ability of storing data in the SSD memory. The structure improves the I/O performance and the data reliability.5. To address the cache replacement problem of multi-node fault tolerant storage systems, a write-prior cache replacement algorithm called WP-LRU (Write Prior LRU) is presented. WP-LRU is based on the fact that it takes higher costs to replace the host writing data than to replace the host read data. Different management algorithms are used to two kinds of data. The LRU algorithm is used to the host read data and the aggregated writing algorithm is used to the host write data. WP-LRU has good scalability. It can reduce the average service time and improve the overall system performance at the same hit ratio.
Keywords/Search Tags:Redundant Array of Inexpensive Disk (RAID), Array Codes, Maximum Distance Separated Codes (MDS), Cache Replacement, Multi-node Fault-tolerance
PDF Full Text Request
Related items