Font Size: a A A

Array Codes With Local Repair Property In Distributed Storage System

Posted on:2022-04-01Degree:MasterType:Thesis
Country:ChinaCandidate:Y WuFull Text:PDF
GTID:2518306539961289Subject:Electronics and Communications Engineering
Abstract/Summary:PDF Full Text Request
With the exponential growth of the total amount of data in today's society,massive amounts of data need to be stored reliably.Traditional data storage methods have become powerless for the storage of huge amounts of data.Under such circumstances,distributed storage systems have emerged and are easy to expand.The advantages of low cost make it gradually become an excellent choice for massive data storage.A distributed storage system is composed of many nodes.With the expansion of the system scale,the number of nodes continues to increase,and node failures occur frequently.Therefore,fault-tolerant technology is needed to improve the reliability of data storage.Fault-tolerant technologies mainly include copy technology and erasure coding technology.Copy technology is to back up data for a certain number of times and requires a lot of storage space,while erasure coding technology can greatly reduce the required storage space while ensuring data reliability.Array codes are a type of erasure codes with simple encoding and decoding operations and easy implementation.Most of the array codes currently designed can tolerate two or three node failures,which is obviously not enough from the perspective of data reliability.This thesis first proposes an encoding method for array code that tolerates multiple column failures,and designs an effective decoding algorithm based on the LU decomposition algorithm when one or more columns fail,and then analyzes the computational complexity.Finally,the proposed code is tested in a distributed cluster environment.The main contributions of this paper are as follows:(1)In this paper,an MDS array code constructed over a specific quotient ring with multiple parity columns is called Generalized Expanded-Blaum-Roth(GEBR)code,which has flexible coding parameters.The proposed codes can not only tolerant multiple columns failures,but also tolerant any entry failure in a column.We propose an efficient decoding method based on LU algorithm for multiple column failures,and compare the existing Expanded-Blaum-Roth(EBR)code and our GEBR code in terms of computational complexity.(2)Based on the theoretical research of the relay model combined with erasure codes,an extended HDFS distributed storage system is designed.Under the construction of different coding parameters,performance comparison tests are carried out from the aspects of encoding and decoding rate,number of failed nodes and location of failed nodes.Experimental results show that when multiple nodes fail,the encoding/decoding rate of GEBR code is significantly higher than that of EBR code,and as the total number of nodes increases,the encoding/decoding efficiency of GEBR code is about 77% higher than that of EBR code.
Keywords/Search Tags:distributed storage system, erasure code, MDS, array code, EBR code
PDF Full Text Request
Related items