Font Size: a A A

A Study Of The Constructions Of Codes For Distributed Storage Systems With Minimum Repair Bandwidth

Posted on:2015-06-04Degree:MasterType:Thesis
Country:ChinaCandidate:L ZhanFull Text:PDF
GTID:2308330464459715Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Global information and data resources have increased significantly since the big data era arrives. James Nicholas Gray, the great American computer scientist who won the Turing Award in 1998, predicted in the award-winning speech that the amount of data generated in every 18 months will catch up with the total amount generated throughout history. Therefore, the requirements of large-scale storage applications increased explosively and people began to focus on the technology of distributed storage to meet those requirements.In distributed storage systems (DSS), data is stored dispersedly on multiple separate devices with high scalability and security. They can also provide reliable access to the data through different levels of redundancy. However, each individually unreliable nodes storing encoded information may fail due to various reasons. For example, nodes may be turned off/on temporarily in a planned manner such as for maintenance or energy-saving. Besides, nodes in P2P systems may frequently go offline and come back online unpredictably. With the expanding of the scales of the distributed storage systems, node failures are no longer rare exceptions but becoming more and more common instead. Therefore, the repair problem arises in the constructions of distributed storage systems: if one of the encoded storage nodes fails, the invalid encoded pieces should be recreated at an alternative node to keep the original reliability level of the system.In this paper, we studied the systems which have adopted erasure coding techniques and find that the failed encoded nodes are often repaired in a naive way:introduce an alternative node into the system and then let the added node access the other live nodes to recreate the whole original file from which the needed pieces can be computed. In this process, one would need data equivalent in amount to recreate the whole original object, even if one single node to be repaired. To make the repair process more effectively including reducing repair bandwidth and enhancing the storage efficiency, we thus focus on the constructions of codes for distributed systems with minimum repair bandwidth. Dimakis et al. introduced network coding into the distributed storage system. They finally figure out a tradeoff between the total repair bandwidth and the storage overhead. Particularly, they put forwards the concept of regenerating code. Later, Wu et al took advantage of the characters of both Maximum distance separable code and Systematic code and proposed a coding scheme in the distributed storage system which achieved the minimum repair bandwidth under the condition of d=k+1. In this thesis, we compared Wu’s construction with others which can also achieve the minimum repair bandwidth and pointed out the limitation in Wu’s construction objectively. After that, we make an extension to Wu’s construction relatively. The proposed extensive construction is very elegant in form, and much more generalized to Wu’s. It can maintain both the systematic and MDS characters in the repair process without enlarge the size of the finite field. Besides, the choices of coefficients are much more flexible which can provide more combinations to the process of repair in the extended construction. In this way, the proposed extensive construction can meet more application scenarios and realistic network conditions...
Keywords/Search Tags:Distributed storage, erasure coding, repair problem, regenerating codes, minimum repair bandwidth
PDF Full Text Request
Related items