
Research On Distributed Fault-Tolerant Storage Technology Based On Erasure Code

Posted on: 2017-12-04    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X Q Pei    Full Text: PDF
GTID: 1368330569998431    Subject: Computer Science and Technology
Abstract/Summary:
With the development of network and information technologies, distributed storage systems have made rapid progress in cost, performance, scalability and capacity. However, they face a major fault-tolerance challenge: the growing number of nodes increases the probability of node failure, and a failure can be catastrophic if data are lost. Designing efficient and reliable distributed fault-tolerant storage techniques has therefore become a top priority. Compared with replication, erasure coding markedly reduces storage cost at the same level of data reliability, and has become an important research topic in the distributed storage field.

Distributed fault-tolerant storage faces three main challenges. (1) Inserting erasure-coded data involves dividing, encoding and transmitting the data, and the source node can become a bottleneck for these operations; growing data volumes make this worse. (2) The growing number of nodes increases the likelihood of multiple simultaneous failures, for which existing repair approaches incur high traffic cost and low efficiency. (3) Updating erasure-coded data requires large data transfers and complex computation, and existing update approaches consume substantial traffic and deliver low efficiency. This dissertation therefore studies data insertion, data repair and data update in depth, to achieve efficient and low-cost storage service.

To address the single-node bottleneck of existing insertion approaches for erasure-coded data, this dissertation proposes D2CP, a decentralized, group-based redundancy generation scheme built on codes with locality, in which a decentralized framework with grouping connects the source
nodes, the data nodes and the parity nodes. First, D2CP adopts a consistent-hashing data placement algorithm that hashes both the blocks and the node locations to improve placement efficiency. Second, to reduce the network traffic of insertion, D2CP adopts a grouped data-sending scheduling algorithm, which schedules sending from the source nodes and dynamically adjusts the number of source nodes according to the insertion load. Finally, to improve encoding efficiency, D2CP adopts a cooperative parity generation algorithm that completes the encoding during insertion cooperatively among all the parity nodes. Experiments on an HDFS-RAID testbed show that D2CP clearly reduces insertion time and network traffic compared with typical insertion approaches: across various parameter settings it reduces the average insertion time by 26% and the average network traffic cost by 24.5%.

To address the bottleneck of centralized repair approaches and the high network traffic cost of decentralized ones, this dissertation proposes DARS, a decentralized adaptive cooperative repair scheme that minimizes repair time with the least extra network traffic. Specifically, DARS adopts an adaptive repair framework with star and tree structures to support repair of both single and multiple failures. A bandwidth-aware node selection algorithm ensures high inter-node bandwidth by preferring nodes with more available bandwidth. To improve repair efficiency, DARS uses a line-structured data transmission algorithm to organize transfers among nodes and a core-based data distribution algorithm to organize transfers between the coordinator and the other newcomers. To reduce network traffic, DARS adopts a lazy repair algorithm within a stripe to
adaptively adjust the repair occasion and the number of intersection providers, ensuring load balance. Experiments on an HDFS-RAID testbed show that DARS clearly improves repair efficiency compared with typical repair approaches, reducing the average repair time by 29% and 55%, respectively, across various parameter settings.

Because updating erasure-coded data involves complex data transmission and computation, update efficiency is strongly affected by the data volume. To this end, this dissertation proposes TA-Update, a tree-structured adaptive update scheme for single updates. Specifically, TA-Update adopts a parameter-independent update tree to maintain the data connections between nodes, supporting encoding algorithms with different parameters. A rack-aware tree construction algorithm builds an optimal update tree to ensure transmission efficiency between nodes. To improve computation efficiency, TA-Update uses a top-down data processing algorithm that pipelines transmission between nodes and distributes the encoding operations among all participating nodes. To improve adaptivity, TA-Update uses a rollback-based failure processing algorithm that efficiently handles node failures during an update and resumes the paused update. Experiments on an HDFS-RAID testbed show that TA-Update clearly improves single-update efficiency compared with typical update approaches: it reduces the average update time by 33% when there is no node failure, and by 44% when one node fails during the update.

For multiple updates, a serial update pattern incurs high network traffic, and update efficiency is again strongly affected by the data volume. To this end, this dissertation proposes a grouped and pipelined update scheme based on
erasure codes for multiple updates, called Group-U. Specifically, Group-U adopts a grouped update framework to organize the node connections effectively. A workload-aware grouping algorithm exploits data locality within each group and dynamically adjusts the group size according to the update workload. Further, Group-U uses a hybrid update algorithm that applies in-time updates at data nodes and lazy updates at parity nodes, ensuring both data consistency and update efficiency. To handle node failures during an update, Group-U adopts a cache-based failure processing algorithm that caches data at each node, so failed data can be reconstructed effectively and the update can proceed. Experiments on an HDFS-RAID testbed show that Group-U clearly improves multi-update efficiency compared with typical update approaches: across various parameter settings it reduces the average update time by 44% and the average network traffic cost by 12%.
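The abstract's core storage-cost argument, that erasure coding matches replication's reliability at lower cost, can be sketched with a back-of-the-envelope comparison. The parameters below (4-way replication vs. a systematic (6, 3) code) are illustrative assumptions, not figures from the dissertation:

```python
# Raw storage overhead: n-way replication vs. a systematic (k + m) erasure code.
# Illustrative parameters only; a (6, 3) code stores k = 6 data blocks
# plus m = 3 parity blocks per stripe.

def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    return float(copies)

def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data under a (k, m) code."""
    return (k + m) / k

# Both configurations tolerate any 3 simultaneous node failures,
# but the erasure code stores far less raw data.
print(replication_overhead(4))  # 4.0 (4 copies survive 3 failures)
print(erasure_overhead(6, 3))   # 1.5 (a (6,3) MDS code survives any 3 erasures)
```

The failure-tolerance equivalence assumes an MDS code such as Reed-Solomon, where any k of the k + m blocks suffice to decode.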
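D2CP's data placement step hashes both the blocks and the node locations via consistent hashing. A minimal sketch of that idea, with hypothetical node names and a virtual-node count chosen only for illustration, might look like:

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Minimal consistent-hash ring: a block maps to the first node
    clockwise from the block's hash point, so adding or removing a node
    only remaps the blocks in that node's arc of the ring."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node gets `vnodes` points on the ring for balance.
        self.ring = sorted(
            (self._hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def locate(self, block_id: str) -> str:
        """Return the node responsible for this block."""
        idx = bisect_right(self.keys, self._hash(block_id)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.locate("stripe0/block3"))  # deterministic placement for this block
```

The choice of MD5 and 64 virtual nodes is arbitrary here; the dissertation's actual hash function and parameters are not specified in the abstract.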
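DARS organizes repair traffic along star and tree structures rather than pulling every surviving block to a single newcomer. The simplification below is not the dissertation's algorithm: it assumes a single-parity XOR code (so the lost block is the XOR of the survivors) and shows only how partial results can be combined along a provider tree so no one link carries all the traffic:

```python
# Tree-structured repair sketch for a single-parity XOR code (assumption:
# the lost block equals the XOR of all surviving blocks in the stripe).
# Each interior provider XORs its own block with its children's partial
# results before forwarding, instead of all providers sending to one node.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def repair_along_tree(node, blocks):
    """node = (provider_id, [child_subtrees]); returns the partial XOR
    accumulated over this subtree."""
    pid, children = node
    acc = blocks[pid]
    for child in children:
        acc = xor_bytes(acc, repair_along_tree(child, blocks))
    return acc

# Hypothetical stripe: three surviving providers, one lost block.
blocks = {"p1": b"\x01", "p2": b"\x02", "p3": b"\x04"}
tree = ("p1", [("p2", []), ("p3", [])])  # p1 aggregates for the newcomer
lost = repair_along_tree(tree, blocks)
assert lost == b"\x07"  # XOR of all survivors
```

A bandwidth-aware scheme like DARS would additionally choose the tree shape and membership from measured link bandwidths, which this sketch omits.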
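The update schemes (TA-Update, Group-U) rest on the fact that for any linear code a parity node can refresh itself from the data delta alone, without rereading the other data blocks in the stripe. A minimal sketch for the XOR-parity case, with made-up two-byte blocks:

```python
# Delta-based parity update for an XOR parity block (assumption: single
# XOR parity; for general linear codes the delta is scaled by the
# block's encoding coefficient instead).

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    # new_parity = old_parity XOR old_data XOR new_data
    delta = xor_bytes(old_data, new_data)
    return xor_bytes(old_parity, delta)

d0, d1 = b"\x01\x02", b"\x10\x20"          # two data blocks in a stripe
parity = xor_bytes(d0, d1)                  # initial parity
new_d0 = b"\x0f\x0f"                        # d0 is overwritten
parity = update_parity(parity, d0, new_d0)  # parity fixed from the delta alone
assert parity == xor_bytes(new_d0, d1)      # matches a full re-encode
```

Lazy parity updating, as in Group-U, amounts to buffering these deltas at the parity node and applying them in batches.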
Keywords/Search Tags:Distributed Storage, Data Fault Tolerance, Erasure Coding, Data Insertion, Data Repair, Data Update