| As the basic platform for processing and storing data,the reliability of storage systems is crucial to ensure data service capability.At present,the storage system mainly uses replication or erasure coding to ensure the reliability of the system.Replication prevents data loss by duplicating the stored data.This method incurs significant storage overhead.Erasure coding can ensure high reliability with less storage overhead,but introduces additional encoding and decoding,which magnifies the computation and data transmission overhead of storage systems.ZD codes(Zigzag-Decodable codes)are a type of erasure codes based on the Zigzag decoding algorithm.It encodes and decodes data through XOR operations with offsets to ensure data fault tolerance.The required computation overhead is much lower than that of other commonly used erasure coding techniques.However,due to its encoding method,ZD codes will incurs more storage overhead due to the parity data.Moreover,during the decoding process,the data transmission volume is large,and the cache is underutilized.This thesis focuses on the encoding and decoding process of ZD codes and then proposes solutions to improve the performance and efficiency of the ZD codes in storage systems.The contributions are summarized as follows.(1)A new zigzag decoding algorithm that makes full use of parity information.The existing decoding algorithm of ZD codes does not fully utilize the parity information,resulting in the inability of the decoding algorithm to support ZD codes data recovery in some encoding situations.At the same time,the decoding algorithm needs to continuously search for the data involved in decoding in multiple blocks during the recovery,which degrades the computation performance of ZD codes.This thesis proposes a decoding selection algorithm for ZD codes and a corresponding new zigzag decoding algorithm to address these shortcomings.The new selection algorithm and decoding algorithm support decoding from the last bits of the ZD codes,thereby improving the information utilization rate of the decoding process,relaxing the conditions required for ZD codes to ensure fault tolerance,and providing a basis for designing ZD codes with less storage overhead.The selection algorithm extracts the decoding information of ZD codes,avoids searching during the decoding process,and reduces the computation overhead and the memory overhead of data repair.At the same time,based on this algorithm,this thesis further proposes optimization schemes for the buffer usage and data transmission of the ZD codes decoding process.(2)A new coding scheme for ZD codes with an optimal storage cost.To ensure fault tolerance,ZD codes enlarge the amount of data stored in the parity blocks,resulting in extra storage overhead.Therefore,this thesis designs a new coding scheme for ZD codes.This thesis proposes a property called the different-difference property,and the relationship between this property and the fault-tolerant performance of ZD codes is proved.Based on the property,we propose a new encoding scheme for ZD codes with the lowest storage cost that can be generated under the condition of parameter m≤3,which is called an OS-ZD code.We test and analyze the performance of OS-ZD codes by implementing the new coding scheme in a real EC library.The results show that compared with the existing ZD codes,OS-ZD codes have the theoretically lowest storage overhead while maintaining high fault tolerance.Compared with RS codes which are widely used in storage systems,the OS-ZD codes have extremely low computation overhead,reducing the impact of data read,write,and recovery on the performance of the storage system. |