Research On Digital Fountain Codes And The Distributed File System Backup Technology

Posted on: 2017-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: K Cheng
Full Text: PDF
GTID: 2348330503495886
Subject: Engineering
Abstract/Summary:
In the era of big data, the volume of global data doubles every two years, a growth rate similar to Moore's law. As an increasingly important asset, data has attracted wide attention in academia and industry. The distributed file system is an important innovation in storage technology and services: it meets many users' requirements for low-cost, large-capacity, safe, and stable storage, and lets different users, applications, and screens share information and interact with services across time and space.

LT codes are the first practical digital fountain codes. Their encoding and decoding algorithms are simple and have low time complexity. Applying LT codes to a distributed file backup system can reduce the storage capacity the system requires. However, applying them directly can degrade data access performance. Among the many distributed file systems, HDFS offers high reliability, scalability, and low cost because it is deployed on a large number of commodity personal computers. However, the multiple-replica strategy of HDFS creates a bottleneck for scaling the storage system. In this thesis, we combine the advantages of LT codes and HDFS for data backup, propose an HDFS dynamic replication storage strategy based on LT codes, and verify the reliability of this strategy in theory. Finally, we design and implement an HDFS cloud storage system based on LT codes.

The main work and contributions of this thesis include the following aspects:

1. Study the principles of LT codes, including degree distribution design and the encoding and decoding algorithms, and on that basis analyze their application prospects for distributed file backup. Explain the basic characteristics of HDFS in detail, then point out the advantages and disadvantages of HDFS backup. Finally, combine the advantages of LT codes and HDFS for distributed file backup, propose an HDFS dynamic replication storage strategy based on LT codes, and verify the reliability of this strategy in theory.

2. According to the actual needs of a cloud storage system, design an HDFS cloud storage system based on LT codes in detail, including the cloud storage system architecture, the client subsystem, the server subsystem, and the HDFS cluster subsystem. Then design the database in detail according to users' needs, the system architecture, and each subsystem.

3. According to the detailed design, implement the HDFS cloud storage system based on LT codes. Then test the login, backup, and restore function modules to verify that the dynamic replication storage strategy is reliable and combines the advantages of LT codes and HDFS for distributed file backup.

Having studied LT codes and HDFS, this thesis proposes an HDFS dynamic replication storage strategy based on LT codes and implements an HDFS cloud storage system based on LT codes, which has practical exploratory value for erasure codes and distributed file systems.
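The LT encoding and decoding steps mentioned above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the block size, and the choice of the ideal soliton degree distribution are assumptions made for brevity (practical LT codes usually use Luby's robust soliton distribution, and the degree distribution designed in the thesis is not given here). Each encoded symbol is the XOR of a randomly chosen set of source blocks, and the decoder repeatedly "peels" symbols that have exactly one unknown block left.

```python
import random


def ideal_soliton_weights(k):
    """Ideal soliton weights for degrees 1..k: 1/k, then 1/(d*(d-1))."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def lt_encode_symbol(blocks, rng):
    """Produce one encoded symbol as (source block indices, XOR payload)."""
    k = len(blocks)
    degree = rng.choices(range(1, k + 1), weights=ideal_soliton_weights(k))[0]
    neighbours = rng.sample(range(k), degree)  # distinct source blocks
    payload = bytes(len(blocks[0]))            # all-zero start value
    for i in neighbours:
        payload = xor_bytes(payload, blocks[i])
    return neighbours, payload


def lt_decode(symbols, k):
    """Peeling decoder: return the k source blocks, or None if stuck."""
    recovered = [None] * k
    pending = [[set(neighbours), payload] for neighbours, payload in symbols]
    progress = True
    while progress:
        progress = False
        for entry in pending:
            neighbours, payload = entry
            # Remove the contribution of every block that is already known.
            for i in [i for i in neighbours if recovered[i] is not None]:
                payload = xor_bytes(payload, recovered[i])
                neighbours.discard(i)
            entry[1] = payload
            # Exactly one unknown block left: the payload is that block.
            if len(neighbours) == 1:
                recovered[neighbours.pop()] = payload
                progress = True
    return recovered if all(b is not None for b in recovered) else None


def demo(seed=1, k=8, block_size=5):
    """Encode k blocks and keep drawing symbols until decoding succeeds."""
    rng = random.Random(seed)
    data = bytes(range(k * block_size))  # hypothetical file contents
    blocks = [data[i * block_size:(i + 1) * block_size] for i in range(k)]
    symbols, decoded = [], None
    while decoded is None and len(symbols) < 100 * k:  # safety cap
        symbols.append(lt_encode_symbol(blocks, rng))
        if len(symbols) >= k:
            decoded = lt_decode(symbols, k)
    return data, decoded, len(symbols)


if __name__ == "__main__":
    original, decoded, used = demo()
    print("recovered:", b"".join(decoded) == original, "symbols used:", used)
```

The loop in `demo` reflects the fountain property: the encoder can keep producing symbols, and the receiver stops as soon as the peeling decoder recovers all k blocks, typically after somewhat more than k symbols. In a backup setting, which storage nodes hold which encoded symbols is a separate design decision that this sketch does not model.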
Keywords/Search Tags: LT Codes, Distributed File Backup, HDFS, Dynamic Replications, Cloud Storage