
Research On Erasure Code-based Data Fault-tolerant Technology For Cloud Storage

Posted on: 2020-09-26    Degree: Doctor    Type: Dissertation
Country: China    Candidate: F L Xu    Full Text: PDF
GTID: 1368330611492954    Subject: Computer Science and Technology
Abstract/Summary:
Cloud storage, built on a large number of nodes, is cheap, flexible, and easy to use, and it is widely used to preserve the massive amounts of data people generate. According to node distribution, cloud storage can be divided into three types: single-center cloud storage, cross-center cloud storage, and P2P cloud storage. The first two provide services by running one or more data centers with large numbers of servers; P2P cloud storage provides services by renting spare storage space and network bandwidth from individuals. For any kind of cloud storage, fault-tolerant technology is essential to ensure that data is not lost to frequent node failures. Fault-tolerant techniques based on erasure codes tolerate more failures and use storage more efficiently than replication, but they are also more complex and face the following problems in cloud storage. (1) Data encoding involves data fragmentation, data calculation, and data distribution; existing encoding methods either consume too many I/O resources or deliver low data read and write speeds. (2) When recovering data, each failed block requires transmitting multiple blocks and performing complex calculations; existing data recovery methods cannot effectively reduce data transmission overhead or improve recovery efficiency. In recent years, the rise of cross-center and P2P cloud storage has made these problems more prominent. In view of these difficulties, this paper comprehensively considers the characteristics of the various types of cloud storage and conducts in-depth research on data encoding and data recovery technology, with the following main contributions:

Existing data encoding methods either incur excessive network transmission and disk reads and writes because of the transformation between fault-tolerant technologies, or severely reduce data read and write speeds because of a discrete block layout; neither is suitable for cross-center cloud storage. To address this problem, this paper proposes pipeline-based distributed cumulative encoding (PDCE) for single-center and cross-center cloud storage. PDCE uses a continuous block layout, writes data to multiple nodes through pipelines, incrementally encodes the data as it flows through the encoding nodes, and caches the intermediate results in memory. As data is written, PDCE gradually generates the parity blocks at the respective encoding nodes and eventually stores each in its corresponding node. PDCE thus obtains optimal data write performance without needing to transform between fault-tolerant technologies, and by adjusting the number of encoding nodes it can flexibly balance fault tolerance against network transmission overhead before encoding completes. Theoretical analysis and extensive experiments in single-center and cross-center environments show that PDCE achieves a better trade-off between encoding overhead and data read/write performance: compared with existing encoding methods, PDCE reduces network transmission by 44.5%-48.4% and disk reads/writes by 45.6%-66.7% while achieving near-optimal data read and write performance. The sketch below illustrates the cumulative encoding idea.
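The following is a minimal illustrative sketch of incremental parity accumulation in a write pipeline; it is not code from the dissertation. A single XOR parity stands in for a real erasure code, which would apply a Galois-field coefficient per data block, and all class and function names are hypothetical.

```python
# Sketch: cumulative encoding in a write pipeline (XOR parity as a stand-in
# for real erasure coding; names are illustrative, not from the dissertation).

class EncodingNode:
    """Accumulates parity in memory while data chunks stream through it."""
    def __init__(self, chunk_size: int):
        self.parity = bytearray(chunk_size)  # cached intermediate parity

    def absorb(self, chunk: bytes) -> bytes:
        # Incrementally fold the passing chunk into the parity accumulator,
        # then hand the chunk back unchanged to continue down the pipeline.
        for i, b in enumerate(chunk):
            self.parity[i] ^= b
        return chunk

def pipelined_stripe_write(chunks, storage_nodes, encoder):
    """Write one stripe: each chunk flows through the encoding node to its
    storage node, so parity grows as the data is written."""
    for chunk, node in zip(chunks, storage_nodes):
        node.append(encoder.absorb(chunk))   # store data, update parity
    return bytes(encoder.parity)             # final parity block, stored last
```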
In existing data recovery methods, most of the data transmission passes through bottleneck links in the network topology, such as the core-layer links inside a data center and the links between data centers. This not only severely limits overall recovery performance but also adversely affects the system's normal data reads and writes. To address this problem, this paper proposes locality-aware tree-structured recovery (LATR) for single-center and cross-center cloud storage. LATR first determines data locality from network topology information or the delay between nodes, and on this basis builds a minimum spanning tree covering all providing nodes as the reconstruction tree. During recovery, data is transmitted upward from the leaf nodes of the reconstruction tree, combined at internal nodes, and transmitted upward again, and so on, until recovery completes at the root node (see the first sketch below). Merging nearby data before transmitting it further reduces the amount of data flowing over upper-layer links. In addition, LATR adopts a locality-based providing-node selection algorithm that quickly selects the optimal providing nodes when multiple options exist. Theoretical analysis shows that LATR's core-layer traffic is 20%-61% lower than that of existing recovery methods, and extensive experiments show that LATR increases recovery throughput by at least 23% and degraded-read speed by up to 68%.

In P2P cloud storage, a node's upload bandwidth is often far lower than its download bandwidth, making data upload a serious performance bottleneck. No existing data recovery method takes this characteristic into account, so they all recover data inefficiently in P2P cloud storage. To address this problem, this paper proposes fragment-based distributed star-structured recovery (FDSR) for P2P cloud storage. FDSR adopts a "dispersion-aggregation" two-level recovery framework: it first divides coding blocks into multiple identically sized coding fragments, then uses multiple reconstructing nodes and star-structured recovery to reconstruct the failed coding fragments in parallel, with each failed fragment drawing on different providing nodes, and finally sends the reconstructed fragments to the replacing nodes to complete recovery (see the second sketch below). By enlisting as many available nodes as possible as providing nodes, FDSR reduces the amount of data any single providing node must upload. With star-structured recovery as the underlying method, FDSR applies to both single-failure and multi-failure recovery, while spreading the work across multiple reconstructing nodes avoids the load imbalance of plain star-structured recovery. Theoretical analysis shows that the data uploaded by providing nodes under FDSR is significantly lower than under existing methods and that FDSR's overall load is more balanced. Extensive experiments show that, compared with existing data recovery methods, FDSR increases recovery speed by 33.2%-87.8% when recovering from a single failure and by 78.4%-110.0% when recovering from multiple failures.
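First, a minimal sketch of tree-structured recovery in the spirit of LATR, not code from the dissertation: it builds a minimum spanning tree over the providing nodes using pairwise delay as the edge weight, then combines partial results bottom-up, with XOR standing in for erasure decoding. The delay matrix and node layout are illustrative assumptions.

```python
# Sketch: locality-aware tree-structured recovery (illustrative only).

def minimum_spanning_tree(delay):
    """Prim's algorithm over a complete delay matrix; returns parent[]
    describing the reconstruction tree rooted at node 0."""
    n = len(delay)
    in_tree = [False] * n
    parent = [-1] * n
    best = [float("inf")] * n
    best[0] = 0                      # node 0 acts as the reconstructing root
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and delay[u][v] < best[v]:
                best[v], parent[v] = delay[u][v], u
    return parent

def tree_recover(parent, block):
    """Fold each node's block into its parent, leaves first, so partial sums
    travel over short local links before crossing upper-layer links."""
    n = len(parent)
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)

    def combine(u):
        acc = bytearray(block[u])
        for c in children[u]:
            for i, b in enumerate(combine(c)):
                acc[i] ^= b          # XOR stands in for erasure decoding
        return bytes(acc)

    return combine(0)                # reconstructed block at the root
```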
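Second, a minimal sketch of the "dispersion-aggregation" idea behind FDSR, again not code from the dissertation: failed fragments are reconstructed in parallel, each from a rotated set of providing nodes so that no single node's upload link becomes the bottleneck. XOR stands in for real erasure decoding, and the provider interface (`read_fragment`, `write_block`), the thread pool, and the round-robin provider choice are illustrative assumptions.

```python
# Sketch: fragment-based distributed star-structured recovery (illustrative).

from concurrent.futures import ThreadPoolExecutor

def star_reconstruct(providers, frag_idx):
    """Star-structured recovery of one fragment: pull the matching fragment
    from each chosen providing node and fold them together."""
    parts = [p.read_fragment(frag_idx) for p in providers]
    acc = bytearray(parts[0])
    for part in parts[1:]:
        for i, b in enumerate(part):
            acc[i] ^= b              # XOR stands in for erasure decoding
    return bytes(acc)

def fdsr_recover(all_providers, k, num_fragments, replacing_node):
    """Reconstruct each failed fragment in parallel, spreading the upload
    load over as many providing nodes as possible, then reassemble."""
    def job(f):
        # Rotate the provider set per fragment so no single node uploads
        # every fragment (upload bandwidth is the bottleneck in P2P storage).
        chosen = [all_providers[(f + j) % len(all_providers)] for j in range(k)]
        return star_reconstruct(chosen, f)

    with ThreadPoolExecutor() as pool:
        fragments = list(pool.map(job, range(num_fragments)))
    replacing_node.write_block(b"".join(fragments))
```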
Keywords/Search Tags: Cloud Storage, Data Fault Tolerance, Erasure Code, Data Encoding, Data Recovery