Efficient Erasure Coding in Distributed Storage System

Posted on:2018-04-27

Degree:Ph.D

Type:Dissertation

University:University of Toronto (Canada)

Candidate:Li, Jun

Full Text:PDF

GTID:1478390020456299

Subject:Computer Engineering

Abstract/Summary:

Distributed storage systems store a substantial amount of data on many commodity servers. As servers failures are common, it is critical for distributed storage systems to store redundancy to tolerate such failures. Conventionally, a distributed storage system replicates data as the redundancy. Recently, erasure coding has been increasingly replacing replication thanks to its lower storage overhead. However, in many scenarios, erasure coding incurs additional overhead, such as higher network traffic, or lowers the performance of data accesses. In this dissertation, we address some of such challenges in two broad areas.;Erasure coding with the optimal network overhead. Traditional erasure codes incur high network overhead when data needs to be reconstructed after a server failure. We study the problem of constructing erasure codes that consume the optimal network traffic to reconstruct data from multiple failures. We start from a new construction of minimum-storage cooperative regenerating (MSCR) codes that reconstruct data from two failures with the optimal network traffic. We show that an existing minimum-storage regenerating (MSR) code is also an MSCR code for two failures, and vice versa. For more general cases, we propose Beehive codes that optimize the volume of network traffic to reconstruct data from more than two failures, with storage overhead only slightly higher than optimum.;I/O efficient erasure coding and systems. Traditionally erasure coding incurs higher I/O overhead because of its encoding and decoding operations. In this dissertation, we propose solutions to minimize the overhead of writing and reading erasure-coded data. On the input side, we design and implement Mist, a new mechanism for disseminating erasure-coded data efficiently to multiple receiving servers in data centers. On the output side, we exploit the demand skewness in distributed storage systems and propose Zebra, a framework that encodes data into multiple tiers dynamically by their demand to reduce the overall overhead to read erasure-coded data. We also investigate the data parallelism of erasure coding, which may affect the performance of running parallel data processing jobs on the erasure-coded data, such as MapReduce, and construct Carousel codes that allow data parallelism to be expanded into an arbitrary number.

Keywords/Search Tags:

Data, Distributed storage, Erasure, Failures, Codes, Network traffic, Overhead

Related items

1	Research On Erasure Codes And Data Repair Techniques For Multiple Node Failures
2	Research On Erasure Codes For Distributed Secure Storage System
3	Research On Erasure Code Based Data Fault-tolerant Technology In Distributed Storage
4	Research On Access Optimization Technologies Of Distributed Storage Based On Erasure Code
5	A Comprehensive Data Redundancy Policy For High Reliable Storage Systems
6	Research On Backup And Repair Technologies Based On Erasure Codes In Distributed Storage Systems
7	Erasure Codes For Fast Data Repairing In Distributed Storage Systems
8	Research On Data Repair Techniques For Erasure Codes In Distributed Storage System
9	Design Of Storage Scheme For Low Storage Overhead Binary Vector Coding Based On Erasure Codes
10	Based On Erasure Codes Distributed Storage System Design And Implementation