Research On Pipelined Reconstruction Scheme For Multiple Failures

Posted on: 2022-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: P L Lai
Full Text: PDF
GTID: 2518306572986299
Subject: Computer technology
Abstract/Summary:
In the era of big data, distributed storage systems, as the main platform for processing and storing data, must handle massive volumes of data. As such a system scales up, failures are bound to occur, leading to data damage or even permanent data loss. Erasure coding, which offers high storage efficiency, is commonly adopted to protect against such loss. However, erasure-coded systems face two challenges. On the one hand, erasure code reconstruction is time-consuming and multi-node failures occur frequently, both of which increase the probability of data loss and demand higher recovery performance from the system. On the other hand, the random layout of data can easily leave the reconstruction load unbalanced across nodes. This thesis studied the reconstruction performance of erasure-coded systems when multiple nodes fail, together with the reconstruction load on each reconstruction node.

To accelerate multi-node reconstruction, a multi-node reconstruction scheme based on pipelined reconstruction (PipeRec) was proposed. To accelerate network transmission, the PipeRec Scheme introduced pipeline segmentation: the pipeline was split into a number of short pipelines to reduce the number of pipeline execution segments, so that multiple nodes could be reconstructed in parallel and all reconstructions could start at the same time, which increased the parallelism of reconstruction. To speed up disk I/O, the disk I/O, network transmission, and computation (decoding) procedures were handled by multiple threads and run in parallel, so that disk I/O overlapped with network transmission. To explore the factors affecting reconstruction time, a pipeline reconstruction time model was established. Theoretical analysis of this model showed that the number of failed nodes and the number of slices in the pipeline were the main factors limiting the reconstruction time of the PipeRec Scheme.

A reconstruction load balancing mechanism was then introduced to make full use of system resources. Specifically, an algorithm was designed to quantify the reconstruction tasks of each reconstruction node. In addition, the reconstruction nodes of each stripe were selected from a global perspective to minimize the differences in the amount of reconstruction work across nodes, thereby balancing the reconstruction load.

To evaluate the PipeRec Scheme quantitatively, two baselines were implemented: the Traffic Efficient Repair Scheme for Multiple Failures (TERS) and the Repair Pipeline Scheme (RP), which use one node and multiple nodes, respectively, to restore the data of multiple failed nodes. The experimental results showed that, compared with the TERS Scheme and the RP Scheme, the PipeRec Scheme reduced reconstruction time by 31.7% to 66.6% and 12% to 39.4%, respectively, because the pipeline split design reduced the number of pipeline execution segments and optimized the network transmission time. At the same time, each node handled multiple reconstruction processes in parallel, which optimized the disk I/O time. The experiments also suggested that the PipeRec Scheme achieved better reconstruction load balancing.
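
The load balancing idea in the abstract (quantify each node's reconstruction tasks, then choose reconstruction nodes across all stripes so the per-node differences stay small) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, whose details the abstract does not give. It assumes a simple greedy least-loaded rule, measures load as the number of chunks a node must rebuild, and assumes a reconstruction node must not already hold a surviving chunk of the same stripe. The names `balance_reconstruction`, `stripes`, and `healthy_nodes` are hypothetical.

```python
def balance_reconstruction(stripes, healthy_nodes):
    """Greedily choose a reconstruction node for every lost chunk.

    stripes: dict mapping stripe id -> (lost, holders), where `lost` is the
        number of chunks to rebuild and `holders` is the set of healthy nodes
        that already store a surviving chunk of that stripe (assumption:
        such nodes cannot also host a rebuilt chunk of the same stripe).
    healthy_nodes: iterable of node ids that can act as reconstruction nodes.

    Returns (assignment, load): the chosen nodes per stripe and the number of
    chunks each node has to rebuild.
    """
    load = {node: 0 for node in healthy_nodes}
    assignment = {}

    # Place the stripes with the most lost chunks first, since they are the
    # hardest to fit; ties are broken by stripe id for determinism.
    for stripe_id in sorted(stripes, key=lambda s: (-stripes[s][0], s)):
        lost, holders = stripes[stripe_id]
        candidates = [n for n in load if n not in holders]
        chosen = []
        for _ in range(lost):
            free = [n for n in candidates if n not in chosen]
            if not free:
                raise ValueError(f"not enough candidate nodes for {stripe_id}")
            # Global view: always pick the currently least-loaded candidate.
            target = min(free, key=lambda n: (load[n], n))
            load[target] += 1
            chosen.append(target)
        assignment[stripe_id] = chosen
    return assignment, load


if __name__ == "__main__":
    example = {
        "s1": (2, {"n1"}),
        "s2": (2, {"n2"}),
        "s3": (1, {"n3"}),
        "s4": (1, set()),
    }
    plan, load = balance_reconstruction(example, ["n1", "n2", "n3", "n4"])
    print(plan)
    print("load spread:", max(load.values()) - min(load.values()))
```

A greedy rule like this is easy to reason about but does not guarantee the minimum possible spread; a global optimizer over all stripes could do better, at higher cost.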
Keywords/Search Tags: Distributed storage system, Erasure code, Reconstruction, Load balance