Font Size: a A A

Erasure Codes for Optimal Node Repairs in Distributed Storage Systems

Posted on:2015-03-13Degree:Ph.DType:Dissertation
University:Princeton UniversityCandidate:Goparaju, SreechakraFull Text:PDF
GTID:1478390017993587Subject:Electrical engineering
Abstract/Summary:
Google, Amazon, and other services store data in multiple geographically separated disks called nodes, among other reasons, to safeguard the data from node failures. Standard techniques for such a distributed way of storage include multiple backups (typically triple replication) or using erasure codes such as Reed-Solomon codes. The latter codes are the most space-efficient for a targeted worst-case number of simultaneous node failures. They are extremely inefficient how- ever for repairing the frequently occurring single node failure. Replication provides the most cost-effective repair in this scenario but ultimately is an unwise option in today's data proliferation. New erasure codes are therefore required to simultaneously optimize storage efficiency, worst-case resilience and repair costs for single node failures.;This dissertation looks at two such erasure codes: regenerating codes, which optimize the communication costs, and locally repairable codes (LRCs), which optimize the I/O costs (number of nodes contacted). Regenerating codes store a file of size M on n nodes and trade-off the amount of data stored alpha per node for the amount of bandwidth gamma used to repair a node. This dissertation presents new code constructions and thereby, state-of-the-art inner bounds for this trade-off region. A lower bound is also provided for alpha for codes achieving the optimal storage point in the trade-off, signifying the necessity of storing an exponential number of symbols. Ideas developed in this analysis have been applied to establish the optimal file size that can be securely stored in the presence of an eavesdropper, when the corresponding regenerating code is at the optimal storage point.;Locally repairable codes, on the other hand, can be viewed as classical erasure codes of dimension k, length n, distance d and a new parameter r, called locality. In storage parlance, an LRC of locality r stores a file of size k on n nodes such that when a node fails, there exist r other nodes that suffice to reconstruct the failed node. Previous considerations on optimality have largely ignored the finite field involved. This dissertation provides codes on the binary field that optimize k for certain families of parameters n, d, and r..
Keywords/Search Tags:Codes, Node, Storage, Optimal, Repair, Data, Optimize
Related items