
Research And Optimization Of Distributed File System Storage Based On HDFS

Posted on: 2016-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
Full Text: PDF
GTID: 2308330473957167
Subject: Computer system architecture

Abstract/Summary:
Because traditional storage systems cannot keep up with the explosive growth of massive data, distributed file systems have emerged and are now widely deployed. Traditional distributed file systems provide reliability through block replication: in a three-way replicated system, data is split into blocks and three copies of each block are stored on different data nodes. The major disadvantage of triple replication is its large storage overhead. As data grows, the storage required by the system grows faster than the hardware infrastructure can be expanded, making storage a major data-center cost bottleneck.

Erasure coding techniques achieve higher data reliability with considerably smaller storage overhead, and Reed-Solomon (RS) codes are the standard design choice. The repair cost of a distributed file system that uses erasure codes such as RS is often considered an unavoidable price to pay for high storage efficiency and high reliability: compared with a three-replication system, RS codes may incur a 12x overhead in repair bandwidth and disk I/O when a single block is lost.

We optimize RS codes by adding additional parities for the data blocks. For RS(12,4) codes, 4 parities, which we call global parities, are created for every 12 data blocks. We then divide the 12 data blocks into three groups and create one additional local parity for each group. This reduces the locality of RS(12,4) codes from 12 to 4: only 4 blocks are needed to repair a single lost block, compared with 12 before the optimization. Furthermore, we prove that a single lost global parity can be repaired with the help of the three local parities, which saves storage because no extra parity needs to be stored for the global parities. A reliability analysis using a standard Markov model shows that the optimized codes achieve almost 100x the reliability of the original RS codes.

We implement the new codes in Hadoop HDFS and compare them with a currently deployed HDFS module that uses RS codes. The disk I/O and network traffic needed to recover a single block are reduced to 39% of those of RS codes, at the cost of only 19% more storage overhead. Because the new codes repair failures faster, they also provide higher reliability.
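To illustrate the locality argument, the following minimal sketch (not the thesis implementation) models each local parity as a simple XOR of its group's blocks, whereas the actual RS(12,4) construction computes parities over GF(2^8). Block sizes, function names, and the group layout here are illustrative assumptions; the only point shown is that a single lost data block is rebuilt from 4 reads (3 surviving group members plus 1 local parity) instead of 12.

```python
# Toy sketch of locality-reduced repair for an RS(12,4)-style layout.
# Assumption: local parity = XOR of the group's data blocks; real codes use
# Reed-Solomon arithmetic over GF(2^8). Block size is a toy value; HDFS
# blocks are typically tens to hundreds of megabytes.
import os
from functools import reduce

BLOCK_SIZE = 4   # toy block size in bytes
GROUP_SIZE = 4   # 12 data blocks -> 3 groups of 4

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def make_local_parities(data_blocks):
    """One local parity per group of GROUP_SIZE data blocks."""
    groups = [data_blocks[i:i + GROUP_SIZE]
              for i in range(0, len(data_blocks), GROUP_SIZE)]
    return [xor_blocks(g) for g in groups]

def repair_single_block(group, lost_index, local_parity):
    """Rebuild one lost block from the group's survivors and the local parity."""
    survivors = [b for i, b in enumerate(group) if i != lost_index]
    return xor_blocks(survivors + [local_parity])

if __name__ == "__main__":
    data = [os.urandom(BLOCK_SIZE) for _ in range(12)]  # 12 data blocks
    local = make_local_parities(data)                   # 3 local parities

    # Simulate losing data block 5 (second block of group 1).
    group_id, offset = 1, 1
    group = data[group_id * GROUP_SIZE:(group_id + 1) * GROUP_SIZE]
    rebuilt = repair_single_block(group, offset, local[group_id])
    assert rebuilt == data[5]
    print("repaired block 5 from 4 reads instead of 12")
```

The storage figure quoted in the abstract follows from the block counts: plain RS(12,4) stores (12+4)/12 ≈ 1.33x the raw data, while adding 3 local parities gives (12+4+3)/12 ≈ 1.58x, an increase of 3/16 ≈ 19% over RS, in exchange for cutting single-block repair traffic to roughly a third.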
Keywords/Search Tags: Distributed File System, Erasure Codes, Storage