| Erasure coding is widely used in distributed storage systems because of its high storage efficiency and strong fault tolerance. However, mainstream storage systems configure redundancy in a one-scheme-for-all fashion and mainly deploy erasure coding with narrow stripes. As a result, the scaling performance of the storage system is poor, and for data with low performance requirements for later access, excessive redundancy incurs unnecessary space overhead. Research has shown that the failure rate of storage devices in large-scale storage systems can vary significantly over time and that, given the heterogeneity of storage devices, dynamically adjusting the data redundancy scheme as failure rates change can effectively reduce physical storage space. The Convertible code framework efficiently converts the code parameters of already encoded data, preserves the code properties before and after conversion, and yields substantial savings in system resources compared with the default re-encoding scheme. Therefore, to improve the scaling performance of existing erasure coding schemes and further save storage space, this thesis builds on the existing theory of the Convertible code framework and carries out two pieces of research on storage scaling and wide stripe generation. (1) To improve storage scalability as much as possible while consuming fewer system resources during scaling, the Convertible code framework is used to study the storage scaling problem. Scalar MDS Convertible codes and vector MDS Convertible codes are used to scale along different dimensions: for scalar Convertible codes, this thesis proposes a Scale-out scheme for adding nodes and a Scale-in scheme for removing nodes; for vector Convertible codes, it proposes a Scale-out scheme for adding nodes. In addition, the parameter design of the scaling schemes is analyzed, and the encoding and decoding performance of the code before and after
scaling is simulated and compared. Finally, the proposed schemes are compared with existing storage scaling schemes. The results show that storage scaling based on the Convertible code framework improves the scaling performance of existing encoding schemes and, compared with the default re-encoding scheme, effectively reduces the scaling bandwidth. (2) Based on the superregular Hankel matrix used to construct Convertible codes, a new complete merging mechanism is proposed. The wide stripe generation scheme based on this mechanism has zero bandwidth cost and achieves optimal merging. However, because the optimal wide stripe generation scheme has high complexity and the block placement of narrow stripes in real storage systems is complex, two algorithms are proposed for the new complete merging mechanism: CNSM_G and CNSM_P. These algorithms quickly merge narrow stripes into wide stripes while minimizing the overall generation bandwidth. Finally, this thesis compares the complete merging mechanisms used in wide stripe generation schemes and presents simulation results for the CNSM algorithms, which demonstrate the feasibility of wide stripe generation based on the Convertible code framework. |
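
To make the narrow-stripe versus wide-stripe trade-off concrete, the following minimal sketch computes the storage overhead of an MDS stripe before and after merging, together with the number of data blocks that default re-encoding would read. The parameters (k = 6, r = 3, four stripes merged) and all function names are illustrative assumptions, not values or interfaces taken from the thesis.

```python
from fractions import Fraction
from typing import Tuple


def storage_overhead(k: int, r: int) -> Fraction:
    """Physical-to-logical storage ratio of an MDS stripe with k data and r parity blocks."""
    return Fraction(k + r, k)


def merge_stripes(k: int, r: int, num_stripes: int) -> Tuple[int, int]:
    """Parameters (k', r) of the wide stripe obtained by merging num_stripes
    narrow (k, r) stripes while keeping the same number of parity blocks."""
    return k * num_stripes, r


def reencode_read_blocks(k: int, num_stripes: int) -> int:
    """Data blocks read by default re-encoding, which reads every data block
    of every narrow stripe to recompute the wide-stripe parities."""
    return k * num_stripes


# Hypothetical parameters chosen only for illustration.
narrow_k, narrow_r, stripes = 6, 3, 4
wide_k, wide_r = merge_stripes(narrow_k, narrow_r, stripes)

print("narrow overhead:", storage_overhead(narrow_k, narrow_r))  # 3/2
print("wide overhead:", storage_overhead(wide_k, wide_r))        # 9/8
print("re-encoding reads:", reencode_read_blocks(narrow_k, stripes), "data blocks")
```

Under these assumed parameters the wide stripe lowers the overhead from 1.5x to 1.125x, but it tolerates the same three block failures across 27 blocks instead of 9, which is why the choice of redundancy depends on the current failure rate. The read count shows the cost that parameter conversion aims to avoid.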