
Optimizing Data Repair And Update For Erasure-Coded Systems With XOR-Based In-Network Computation

Posted on: 2020-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Tang
Full Text: PDF
GTID: 2428330590958338
Subject: Computer system architecture
Abstract/Summary:
Erasure coding is widely used in distributed storage systems because it is far more storage-efficient than replication. However, erasure coding introduces high cross-rack traffic for two reasons: repairing a single failed data block requires reading other available blocks from multiple nodes, and updating a data block triggers updates to all of its parity blocks. Excessive cross-rack traffic degrades the performance of many applications in the system.

Many new erasure codes have been proposed to improve the performance of erasure coding. These codes can fundamentally reduce network traffic, but at the expense of other performance metrics. Another optimization approach improves repair and update performance by optimizing the transmission path without changing the coding structure. This approach can be applied to a variety of new erasure codes, and performance can be further improved by combining the two strategies. However, the transmission schemes proposed so far focus only on improving the performance of repair and update operations, and do not actually reduce the cross-rack traffic introduced by erasure coding.

With the emergence of programmable network devices, the concept of in-network computation has been proposed. Its key idea is to offload computation onto intermediate network devices. Inspired by this idea, we propose two new transmission schemes with XOR-based in-network computation, one to optimize the repair operation and one to optimize the update operation. The core idea of both schemes is to have the network device perform the XOR. In the repair operation, data from multiple nodes is aggregated in the network, where a programmable network device XORs the data and forwards only the result, thereby avoiding many end-to-end network transmissions. In the update operation, the network device calculates the delta instead of the storage node and then sends the delta over different links, which shortens the transmission path and eliminates the network bottleneck.

We implement a prototype based on HDFS-RAID and SDN to simulate an in-network computation framework. For repair operations, the scheme makes degraded-read performance the same as normal-read performance and reduces network traffic by up to 41% compared with repair pipelining. In addition, compared with the delta-based update scheme, update time and traffic are reduced by up to 74% and 30%, respectively.
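The XOR arithmetic behind both schemes can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes a single-parity XOR code (the parity block is the XOR of all data blocks), and the names `xor_blocks` and `aggregate` are hypothetical helpers introduced only for illustration. The `aggregate` function stands in for what a programmable network device would do when it combines incoming blocks and forwards one result.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def aggregate(blocks):
    """Hypothetical in-network step: fold many blocks into one by XOR."""
    return reduce(xor_blocks, blocks)


# Assumed layout: three data blocks plus one XOR parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = aggregate(data)

# Repair: data[1] is lost. The surviving blocks are XORed together,
# so the requester only needs the single aggregated block.
survivors = [data[0], data[2], parity]
repaired = aggregate(survivors)

# Update: data[0] is overwritten. The delta is old XOR new, and the
# new parity is old parity XOR delta; other data blocks are not read.
new_block = b"\xff\x00\xff\x00"
delta = xor_blocks(data[0], new_block)
new_parity = xor_blocks(parity, delta)
```

In this sketch the requester of a degraded read receives one block instead of three, which is the intuition behind the traffic reduction the abstract describes. Codes with several parity blocks generally also need coefficient multiplications in a finite field, which this sketch leaves out.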
Keywords/Search Tags: Distributed Storage System, Erasure Coding, In-network Computation, Repair, Update, Cross-rack Traffic