The Research About The Encoding And Repair Mechanism Based On Erasure Code In The Distributed Storage System

Posted on: 2017-11-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y J Fan  Full Text: PDF
GTID: 2348330503981905  Subject: Electronics and Communications Engineering
Abstract/Summary:
In the era of big data, storing and managing massive amounts of data are important problems. Traditional centralized storage systems can no longer meet growing storage requirements. As a result, distributed storage systems, which spread data across inexpensive networked personal computers, are increasingly widely used.

A distributed storage system stores data separately on nodes made up of inexpensive personal computers in the network. To ensure reliability and stability, the system keeps a certain amount of redundant data on the storage nodes according to a redundancy strategy. There are currently two main redundancy strategies: replication and erasure coding. Of the two, erasure coding is widely used in distributed systems because it effectively reduces storage cost. It is therefore worthwhile to study erasure-code-based redundancy in distributed storage systems.

This thesis mainly studies the expansion of data nodes and the repair of data after node failure. The specific work is as follows:

1) Erasure codes based on cyclic-shift XOR operations over the binary field have low encoding/decoding complexity and low computational cost, but they scale poorly. A distributed storage code based on cyclic shift and binary addition is therefore proposed, named the RCBC (Regenerating Code over Binary Cyclic Code) code. To ensure the reliability of the distributed storage system, 3 source blocks are mapped to 6 coded blocks, which are stored on 6 distributed storage nodes. The construction ensures that RCBC has the MDS (Maximum Distance Separable) (3,6) property: any 3 of the 6 coded blocks suffice to recover all of the original information.
Analysis of encoding/decoding complexity, reconstruction bandwidth, and coding rate shows that the proposed RCBC code has low encoding/decoding complexity and small reconstruction bandwidth.

2) The (n, k) CP-ZD (Combination Property Zigzag Decodable) code is an erasure code that can be extended to n nodes. Because it is based on XOR operations over the binary field, the CP-ZD code has low storage overhead and low computational complexity. During data repair, however, it consumes a large amount of bandwidth. To solve this problem, an optimal-bandwidth repair scheme is proposed. During repair, the decoding matrix is solved first; the corresponding decoding packets are then fetched according to that matrix, and the original packets are recovered using the ED (Equality Decoding) rule. The scheme effectively guarantees data integrity and reduces bandwidth consumption during data repair.
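To make the idea of shift-and-XOR coding with zigzag decoding more concrete, here is a minimal Python sketch. It is not the RCBC or CP-ZD construction from the thesis, whose details the abstract does not give. It is a toy code, assumed here purely for illustration, that maps 2 source packets to 4 coded packets so that any 2 of the 4 recover the data. The function names `xor`, `encode`, and `decode` and the packet labels `a`, `b`, `p1`, `p2` are hypothetical.

```python
from itertools import combinations

def xor(u, v):
    """Bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(u, v)]

def encode(a, b):
    """Map two source packets to four coded packets (toy example).

    p1 = a XOR b                   (aligned)
    p2 = a XOR (b delayed by one)  (one bit longer than the sources)
    """
    assert len(a) == len(b)
    return {
        "a": list(a),
        "b": list(b),
        "p1": xor(a, b),
        "p2": xor(a + [0], [0] + b),
    }

def decode(have):
    """Recover (a, b) from any two surviving packets, given by name."""
    if "a" in have and "b" in have:
        return have["a"], have["b"]
    if "p1" in have and "a" in have:
        return have["a"], xor(have["a"], have["p1"])
    if "p1" in have and "b" in have:
        return xor(have["b"], have["p1"]), have["b"]
    if "p2" in have and "a" in have:
        shifted_b = xor(have["a"] + [0], have["p2"])  # equals [0] + b
        return have["a"], shifted_b[1:]
    if "p2" in have and "b" in have:
        padded_a = xor([0] + have["b"], have["p2"])   # equals a + [0]
        return padded_a[:-1], have["b"]
    # Only the two parity packets survive: zigzag decoding.
    # p2[0] exposes a[0] alone; each recovered bit unlocks the next one.
    p1, p2 = have["p1"], have["p2"]
    a, b = [], []
    for i in range(len(p1)):
        prev_b = b[i - 1] if i > 0 else 0
        a.append(p2[i] ^ prev_b)   # p2[i] = a[i] XOR b[i-1]
        b.append(p1[i] ^ a[i])     # p1[i] = a[i] XOR b[i]
    return a, b

# Check that every pair of surviving packets recovers the sources.
a = [1, 0, 1, 1]
b = [0, 1, 1, 0]
packets = encode(a, b)
for pair in combinations(packets, 2):
    survivors = {name: packets[name] for name in pair}
    assert decode(survivors) == (a, b)
```

The sketch uses only shifts and XORs, which is the source of the low computational cost the abstract attributes to this family of codes. The price in this toy version is one extra bit in `p2`. Repair bandwidth, the focus of the thesis, is not modeled here.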
Keywords/Search Tags: Distributed Storage System, MDS, Binary Cyclic Code, Repair Bandwidth