The Research And Application On Reed-Solomon Codes Based On Distributed Storage System

Posted on: 2018-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhou
Full Text: PDF
GTID: 2348330518459412
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the spread of computers and intelligent devices and the rapid development of Internet technology, data of all kinds is growing exponentially, which poses new challenges for storage systems. The security of stored data has become an urgent problem in current storage systems. At present, distributed storage is an effective way to handle large volumes of data. There are two ways to protect data in a distributed storage system: multiple replication and erasure coding. Multiple replication is simple and easy to implement: a piece of data is copied several times and the copies are stored separately, the most common form being 3-replication. However, multiple replication can improve data security only by increasing the number of replicas, which drives the storage cost up sharply. To reduce this cost, erasure coding, which was originally used to handle data loss in communication systems, has been applied to storage systems. Erasure coding lowers the storage cost while guaranteeing the same or an even higher level of data security than multiple replication. It also introduces a new problem: the system resources and the number of I/Os consumed when reconstructing lost data increase significantly.

To address this problem, this thesis starts from Reed-Solomon codes, analyzes their coding equations and characteristics, draws on the advantages of array codes and LDPC codes, and proposes the LRC code as an improvement on the Reed-Solomon code. The definition of the LRC code is given, its fault tolerance is analyzed, and a Markov model is established to analyze its reliability. The effects of the construction matrix of the coding equations and of the coding parameters are also compared.

To apply the coding idea of LRC, the open-source distributed storage system HDFS is taken as the research platform, and its system architecture and data placement strategy are analyzed. A design for implementing the LRC code in HDFS is then put forward, covering three aspects: the data placement strategy, the data reconstruction process, and the communication check mechanism.

Finally, the results of three sets of experiments show that, at decoding time, the LRC code costs almost half as much as the Reed-Solomon code. When the coding parameters are changed, the encoding and decoding performance of the LRC code does not change significantly, so the code offers a wider choice of parameter combinations. As for the coding matrix, coding equations based on the Cauchy matrix and coding equations based on the Vandermonde matrix have similar performance.
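The trade-off described in the abstract (storage cost versus the I/O needed to rebuild a lost block) can be illustrated with a small cost model. The sketch below is not taken from the thesis. The parameter choices RS(6,3) and LRC(6,2,2), the `Scheme` class, and the helper names are assumptions made for illustration. It also assumes an LRC layout in which the data blocks are split into equal local groups, each with one local parity computed as a plain XOR, which is a simplification of how a real LRC computes its parities. The model counts blocks read during a single-block repair, which is related to but not the same as the decoding time measured in the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Simplified cost model of one redundancy scheme (illustrative only)."""
    name: str
    data_blocks: int    # number of original data blocks
    stored_blocks: int  # data blocks plus all redundancy blocks
    repair_reads: int   # blocks read to rebuild ONE lost data block

    @property
    def overhead(self) -> float:
        """Stored blocks per data block (3.0 means three times the raw size)."""
        return self.stored_blocks / self.data_blocks


def replication(copies: int) -> Scheme:
    # A lost copy is rebuilt by reading one surviving copy.
    return Scheme(f"{copies}-replication", 1, copies, 1)


def reed_solomon(k: int, m: int) -> Scheme:
    # k data blocks, m parity blocks; any k surviving blocks rebuild a lost one.
    return Scheme(f"RS({k},{m})", k, k + m, k)


def lrc(k: int, groups: int, global_parities: int) -> Scheme:
    # Assumed layout: k data blocks in `groups` equal local groups, one local
    # parity per group, plus `global_parities` global parity blocks.
    if k % groups != 0:
        raise ValueError("k must be divisible by the number of local groups")
    group_size = k // groups
    # One lost data block is rebuilt from the rest of its group plus the
    # local parity: (group_size - 1) + 1 = group_size reads.
    return Scheme(f"LRC({k},{groups},{global_parities})",
                  k, k + groups + global_parities, group_size)


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks (stand-in for a local parity)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


if __name__ == "__main__":
    for s in (replication(3), reed_solomon(6, 3), lrc(6, 2, 2)):
        print(f"{s.name:>16}: overhead {s.overhead:.2f}x, "
              f"reads per single-block repair {s.repair_reads}")

    # Local repair within one group of three data blocks.
    group = [b"abc", b"def", b"ghi"]
    local_parity = xor_blocks(group)
    rebuilt = xor_blocks([group[0], group[2], local_parity])
    print("rebuilt second block:", rebuilt)
```

Under these assumed parameters, the LRC layout stores one more block than the Reed-Solomon layout but reads half as many blocks to repair a single lost data block, which is the kind of saving the thesis targets.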
Keywords/Search Tags:erasure coding, distributed storage, coding equations, data placement strategy