Research And Application Of Binary Array Code In Distributed Storage System

Posted on: 2021-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: X S Yu
Full Text: PDF
GTID: 2518306470960939
Subject: Electronics and Communications Engineering
Abstract/Summary:
The total amount of data is growing dramatically. Traditional storage solutions cannot cope with data at this scale, whereas distributed storage systems are easy to expand and inexpensive, and they have gradually become the preferred choice for storing massive amounts of data. A distributed storage system consists of many cheap and unreliable nodes. As the system scales up and the number of nodes grows, storage node failures become the norm, so fault-tolerance techniques are required to improve storage reliability. The two main techniques are replication and erasure coding. Replication stores multiple copies of the data; it guarantees reliability effectively but wastes a great deal of storage space. Erasure coding can greatly improve storage space utilization while still ensuring data reliability. Array codes are an important class of erasure codes, and their encoding and decoding operations are simple and easy to implement. Existing distributed storage systems built on array codes have low fault tolerance, which is not enough to ensure data reliability. Array codes that can tolerate the failure of multiple nodes are therefore very attractive for distributed storage systems.

This thesis first studies the decoding problem for multi-column failures of array codes, then designs a highly fault-tolerant distributed storage system based on array codes that tolerate multi-column failures, and finally carries out tests and analysis in a real distributed cluster environment. The main work and innovations are as follows:

(1) A Vandermonde array code with multiple parity columns, named the NVA code (New Vandermonde Array Code), is introduced, and its decoding methods for single-column and two-column failures are briefly described. For the case where multiple columns of the NVA code fail, we exploit the structure of the code to propose a decoding scheme that recovers the failed columns whenever the number of consecutively available parity columns is not less than the number of failed information columns. We also compare Vandermonde array codes with Cauchy array codes in terms of encoding and decoding methods, computational complexity, and other aspects.

(2) Based on the relay model, a highly fault-tolerant distributed storage system named BDFS (Binary array coding Distributed File System) is designed. BDFS integrates the fault-tolerance mechanisms of both Vandermonde array codes and Cauchy array codes and provides basic functions such as file encoding, uploading, downloading, and decoding. BDFS can tolerate multiple node failures and demonstrates a practical way to apply erasure coding in distributed storage systems.

(3) Vandermonde array codes and Cauchy array codes are deployed in a real distributed cluster. We implement a test framework for the coded storage system that simulates encoding and decoding when multiple nodes fail, and we compare the array codes in terms of encoding rate and decoding rate. The experimental results show that, for the same fault tolerance, the encoding rate of Vandermonde array codes is about 58% higher than that of Cauchy array codes, and the decoding rate is about 70% higher.
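As a rough illustration of the trade-off between replication and erasure coding described in the abstract, the following Python sketch uses a single XOR parity column, which is the simplest possible binary array code. It is not the NVA code, the Cauchy array code, or any part of BDFS; the function names `encode`, `recover`, and `storage_overhead` are hypothetical and chosen only for this example. The sketch tolerates exactly one failed column, whereas the codes studied in the thesis tolerate several.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_columns):
    """Append one parity column: the XOR of all data columns.

    Assumption for this sketch: every column has the same length.
    """
    parity = reduce(xor_bytes, data_columns)
    return list(data_columns) + [parity]


def recover(columns, failed_index):
    """Rebuild one failed column (data or parity) by XOR-ing the survivors."""
    survivors = [c for i, c in enumerate(columns) if i != failed_index]
    return reduce(xor_bytes, survivors)


def storage_overhead(k, m):
    """Stored bytes per byte of user data for k data and m redundant columns."""
    return (k + m) / k


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    stored = encode(data)

    # Pretend column 1 was lost and rebuild it from the other three.
    assert recover(stored, 1) == data[1]

    # 3 data + 1 parity columns versus keeping 3 full copies (k=1, m=2).
    print(storage_overhead(3, 1), storage_overhead(1, 2))
```

With three data columns and one parity column, the sketch stores about 1.33 bytes per byte of user data, compared with 3 bytes for three-way replication, at the cost of tolerating only a single failure. Tolerating more failures requires additional, independently constructed parity columns, which is the problem the Vandermonde and Cauchy constructions address.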
Keywords/Search Tags: distributed storage system, Vandermonde array codes, Cauchy array codes, fault tolerance