Font Size: a A A

Efficient Erasure Coding in Distributed Storage System

Posted on:2018-04-27Degree:Ph.DType:Dissertation
University:University of Toronto (Canada)Candidate:Li, JunFull Text:PDF
GTID:1478390020456299Subject:Computer Engineering
Abstract/Summary:
Distributed storage systems store a substantial amount of data on many commodity servers. As servers failures are common, it is critical for distributed storage systems to store redundancy to tolerate such failures. Conventionally, a distributed storage system replicates data as the redundancy. Recently, erasure coding has been increasingly replacing replication thanks to its lower storage overhead. However, in many scenarios, erasure coding incurs additional overhead, such as higher network traffic, or lowers the performance of data accesses. In this dissertation, we address some of such challenges in two broad areas.;Erasure coding with the optimal network overhead. Traditional erasure codes incur high network overhead when data needs to be reconstructed after a server failure. We study the problem of constructing erasure codes that consume the optimal network traffic to reconstruct data from multiple failures. We start from a new construction of minimum-storage cooperative regenerating (MSCR) codes that reconstruct data from two failures with the optimal network traffic. We show that an existing minimum-storage regenerating (MSR) code is also an MSCR code for two failures, and vice versa. For more general cases, we propose Beehive codes that optimize the volume of network traffic to reconstruct data from more than two failures, with storage overhead only slightly higher than optimum.;I/O efficient erasure coding and systems. Traditionally erasure coding incurs higher I/O overhead because of its encoding and decoding operations. In this dissertation, we propose solutions to minimize the overhead of writing and reading erasure-coded data. On the input side, we design and implement Mist, a new mechanism for disseminating erasure-coded data efficiently to multiple receiving servers in data centers. On the output side, we exploit the demand skewness in distributed storage systems and propose Zebra, a framework that encodes data into multiple tiers dynamically by their demand to reduce the overall overhead to read erasure-coded data. We also investigate the data parallelism of erasure coding, which may affect the performance of running parallel data processing jobs on the erasure-coded data, such as MapReduce, and construct Carousel codes that allow data parallelism to be expanded into an arbitrary number.
Keywords/Search Tags:Data, Distributed storage, Erasure, Failures, Codes, Network traffic, Overhead
Related items