Design Of Piggybacking Framework In Distributed Storage System

Posted on: 2022-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 2518306605467274
Subject: Communication and Information System
Abstract/Summary:
To improve the reliability and availability of data, fault-tolerant technology is introduced into distributed storage systems so that data can still be accessed and used normally after some nodes fail. Traditional erasure coding has higher storage efficiency and higher fault tolerance than replication. However, when failed or unavailable nodes are repaired, the amount of data read and downloaded increases significantly, which places a heavy burden on the system's I/O resources and network bandwidth. The piggybacking framework operates on multiple substripes of a base code. By adding data from one substripe to another, it reduces the decoding work required of the base code and thereby reduces the repair bandwidth of a failed node. It can satisfy the data-center requirements of the maximum-distance-separable (MDS) property, a small number of substripes, and a high code rate. This thesis focuses on piggybacking design in distributed storage systems and analyzes the structure of piggybacking designs in terms of repair bandwidth, repair locality, and computational complexity. The main contributions are as follows.

Firstly, this thesis briefly summarizes the challenges faced by distributed storage systems, points out the shortcomings of existing fault-tolerant technologies, and analyzes the advantages of piggybacking design in detail. By analyzing the advantages and disadvantages of existing piggybacking designs, it summarizes the common methods of piggybacking design, which lays the foundation for further research.

Secondly, a piggybacking framework for repairing information nodes is improved. By re-deriving the formula for the average repair bandwidth of an information node, a more general lower bound is obtained. To reduce the repair bandwidth of parity nodes, two methods for designing the piggyback function through overall embedding are proposed. The first method is simple and easy to implement, but is applicable only to some coding parameters. The second method has more complex steps, but can be applied to almost all coding parameters. With these methods, the improved piggybacking design reduces the repair bandwidth of parity nodes while retaining the repair bandwidth of systematic nodes.

Thirdly, because repair locality has been insufficiently studied in existing piggybacking designs, this thesis proposes a novel cloned piggybacking design. The scheme is simple: the nodes are divided into groups, the piggybacking and repair methods are defined on one group of nodes, and the corresponding nodes in the other groups are piggybacked in the same way. The trade-off between repair bandwidth and repair locality is established by the rule that “one parity node only piggybacks symbols from one substripe”. The repair bandwidth of information nodes is then further reduced by supplementary embedding. The maximum average repair bandwidth ratio approaches 0.0625 as the number of parity nodes tends to infinity.

Finally, the existing piggybacking designs are summarized. Together with the two designs proposed in this thesis, they are compared in three respects: repair bandwidth, computational complexity, and repair locality. Although the cloned piggybacking design can efficiently repair only the information nodes, it achieves the lowest average repair bandwidth ratio for information nodes when the number of parity nodes is greater than 5. Compared with other designs, the cloned piggybacking design offers a better trade-off between repair locality and repair bandwidth. The improved piggybacking design of Chapter 3 has low computational complexity, retains the MDS property, and can efficiently repair all nodes. Its repair bandwidth ratio is low when the number of information nodes is large. Therefore, compared with other designs, the improved design has better overall repair properties and is more suitable for large-scale distributed storage systems.
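
The basic idea of adding data from one substripe to another can be illustrated with a toy sketch. It is not either of the designs proposed in the thesis. It assumes a (4,2) MDS base code with two substripes a = (a1, a2) and b = (b1, b2) and arithmetic modulo a small prime; the prime `P`, the node layout, and the function names `encode` and `repair_node0` are all hypothetical choices made for illustration. The last parity node carries a1 as a piggyback on its second substripe, so node 0 can be rebuilt from 3 downloaded symbols instead of the 4 that decoding both substripes would need.

```python
P = 257  # assumed small prime field, chosen only for illustration


def encode(a, b):
    """Encode two substripes with a (4,2) base code plus one piggyback.

    Returns four nodes, each storing one symbol per substripe.
    """
    a1, a2 = a
    b1, b2 = b
    return [
        (a1, b1),                                    # information node 0
        (a2, b2),                                    # information node 1
        ((a1 + a2) % P, (b1 + b2) % P),              # parity node 2
        ((a1 + 2 * a2) % P, (b1 + 2 * b2 + a1) % P), # parity node 3, a1 piggybacked
    ]


def repair_node0(nodes):
    """Rebuild node 0 from three symbols of the surviving nodes.

    Returns the repaired (a1, b1) pair and the number of symbols downloaded.
    """
    b2 = nodes[1][1]           # second-substripe symbol of node 1
    parity_b = nodes[2][1]     # b1 + b2
    piggybacked = nodes[3][1]  # b1 + 2*b2 + a1
    b1 = (parity_b - b2) % P
    a1 = (piggybacked - b1 - 2 * b2) % P
    return (a1, b1), 3


nodes = encode((10, 20), (30, 40))
repaired, downloaded = repair_node0(nodes)
```

In this sketch the repair bandwidth ratio for node 0 is 3/4 of the plain base-code repair. The b substripe remains decodable from any two nodes because a1 can first be recovered from the a substripe and subtracted from the piggybacked symbol.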
Keywords/Search Tags: Distributed storage, Piggybacking design, MDS code, Node repair, Repair bandwidth