
Research On Local Redundant Array Code In Distributed Storage System

Posted on: 2020-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: F Xiao
GTID: 2438330620455604
Subject: Software engineering
Abstract/Summary:
Erasure codes are a class of coding algorithms that originate from channel coding theory. They were first used for error detection and correction in data transmission and were later applied to distributed storage systems. In a storage system, an erasure code encodes the original data into redundant data and then uses that redundancy to provide fault tolerance. Array codes are a widely used family of erasure codes because they are simple to construct and fast to encode and decode. Among array codes, EVENODD and RDP are the most mature in both research and application.

However, traditional array codes share two problems. First, recovering a single failed disk is expensive, so single-disk recovery is inefficient. In a distributed storage system, a long single-disk recovery time increases the probability of multiple disk failures, and the bandwidth between nodes is one of the system's bottlenecks. The recovery time and the amount of data read during recovery are therefore key to system stability. Second, the fault tolerance of array codes is limited: EVENODD and RDP can tolerate at most two disk failures. In view of this, and building on a study of common array code constructions and optimization methods, this thesis proposes algorithms that modify array codes with local redundancy. The work includes the following.

(1) The encoding and decoding algorithms of common array codes are studied, including EVENODD, RDP, STAR, RTP, and X code. The characteristics of their encoding constructions and decoding algorithms are analyzed to identify their performance advantages and limitations. Common optimization schemes for array codes are also studied.

(2) A method of adding redundant columns is proposed to reduce the data-reading overhead of single-disk failure recovery and to improve fault tolerance. EVENODD and RDP are taken as examples, and the original algorithms are improved and optimized. The method applies local optimization in the horizontal direction and in the diagonal direction respectively. First, the coding efficiency is verified to be consistent with that of traditional EVENODD and RDP. Second, it is proved theoretically that the method reduces the data-reading overhead for single-disk and double-disk failures. Finally, by enumerating triple-disk failures, it is concluded that the optimized codes can recover from 75% of triple-disk failures.

The new encoding and decoding methods of EVENODD and RDP with local redundancy columns are tested experimentally. A performance test platform for erasure codes is built on the HDFS file system, and encoding and decoding simulation experiments for the modified codes are carried out on it. The experimental results show that the recovery efficiency of the improved codes is significantly higher than that of traditional EVENODD and RDP.
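The abstract does not give the thesis's actual construction, so the following is only a generic sketch of why a local redundancy column can lower the read cost of single-disk recovery. It assumes plain XOR parity over one row of data blocks, with each integer standing in for one block. The function names (`encode`, `recover`) and the group sizes are hypothetical and are not taken from the thesis, EVENODD, or RDP.

```python
from functools import reduce
from operator import xor


def xor_all(values):
    """XOR a sequence of integers (each integer stands in for one data block)."""
    return reduce(xor, values, 0)


def encode(data, group_size):
    """Return one XOR parity per group of `group_size` consecutive data columns.

    group_size == len(data) gives a single global row parity;
    a smaller group_size gives one local parity per group (assumed layout).
    """
    return [xor_all(data[i:i + group_size])
            for i in range(0, len(data), group_size)]


def recover(data, parities, lost, group_size):
    """Rebuild data[lost] from its own group only.

    Returns (recovered_value, blocks_read), where blocks_read counts the
    surviving data blocks in the group plus the group's parity block.
    """
    group = lost // group_size
    start = group * group_size
    end = min(start + group_size, len(data))
    peers = [data[i] for i in range(start, end) if i != lost]
    value = xor_all(peers) ^ parities[group]
    return value, len(peers) + 1


if __name__ == "__main__":
    row = [3, 14, 15, 92, 65, 35, 89, 79]    # 8 data columns, one row
    global_parity = encode(row, 8)            # one parity over all 8 columns
    local_parity = encode(row, 4)             # one parity per 4-column group
    print(recover(row, global_parity, 5, 8))  # reads 8 blocks
    print(recover(row, local_parity, 5, 4))   # reads 4 blocks
```

In this toy layout, rebuilding one lost column with a single global parity reads 8 blocks, while rebuilding it from a 4-column local group reads 4. The price is one extra parity column of storage, which is the kind of trade-off a local redundancy scheme has to weigh.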
Keywords/Search Tags: Data Storage, Erasure Codes, Array Codes, Single Disk Failure