Research On Distributed Storage Technology Based On Data De-Duplication And CHORD Protocol

Posted on: 2011-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: X J Jin
Full Text: PDF
GTID: 2178330338979939
Subject: Computer Science and Technology
Abstract/Summary:
As data volumes grow rapidly in the information age, data has become a valuable asset, and its value now far exceeds that of the computer systems that hold it. At the same time, various unpredictable factors make data vulnerable to loss, which can cause users heavy losses. Massive data therefore challenges every aspect of the storage system, and efficient data storage technology has attracted wide attention.

To meet the demands that massive data places on storage systems, we first study existing block-level data de-duplication techniques, compare the advantages and disadvantages of fixed-length and variable-length block de-duplication, and analyze the factors that affect de-duplication efficiency. We then focus on the Rabin fingerprint-based variable-length block algorithm and propose a new algorithm for searching for cut points in a file.

Based on the characteristics of the Chord protocol and of data de-duplication, we design strategies for locating file resources and for filtering duplicate data. By distributing the block index information across different nodes according to the Chord protocol, we avoid the problem of a centralized block index. Finally, this thesis proposes a distributed storage system architecture that combines Chord-based distributed storage with Rabin fingerprint-based variable-length block de-duplication.

The experimental results show that introducing data de-duplication into a Chord-based distributed storage system reduces the system's storage burden. In addition, the reduced volume of transmitted data improves the efficiency of data backup and recovery over low-bandwidth networks.
Keywords/Search Tags: distributed storage, Chord protocol, de-duplication, Rabin fingerprint
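
As a rough illustration of the two ideas the abstract combines, the Python sketch below splits a byte stream at content-defined cut points and places each block's index entry on a Chord-style identifier ring. It is not the thesis's algorithm: the cut-point search is a generic polynomial rolling hash standing in for a true Rabin fingerprint, node lookup uses a sorted list with a global view instead of Chord finger-table routing, and every name and constant (`split_chunks`, `successor`, `DedupCluster`, `WINDOW`, `MASK`, `MIN_CHUNK`, `MAX_CHUNK`, `M_BITS`) is an assumption made for the example.

```python
import bisect
import hashlib

# All constants below are illustrative assumptions, not values from the thesis.
WINDOW = 48            # bytes covered by the sliding window
BASE = 257             # base of the polynomial rolling hash
MOD = (1 << 61) - 1    # modulus of the rolling hash
MASK = (1 << 11) - 1   # cut when the low 11 bits of the hash are zero
MIN_CHUNK = 512        # no cut before a block reaches this size
MAX_CHUNK = 8192       # forced cut once a block reaches this size
M_BITS = 32            # size of the ring identifier space, in bits


def split_chunks(data: bytes) -> list:
    """Split data into variable-length blocks at content-defined cut points."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        size = i - start + 1          # block length including the current byte
        if size > WINDOW:             # window is full: drop its oldest byte
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD   # shift the current byte into the window
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0       # begin a new block with an empty window
    if start < len(data):
        chunks.append(data[start:])   # trailing block, possibly short
    return chunks


def key_of(fingerprint: bytes) -> int:
    """Map a SHA-1 digest onto the M_BITS-bit identifier ring."""
    return int.from_bytes(fingerprint, "big") % (1 << M_BITS)


def successor(node_ids: list, key: int) -> int:
    """Return the first node id >= key on the ring, wrapping around at the end."""
    i = bisect.bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


class DedupCluster:
    """In-memory stand-in for a set of storage nodes sharing one ring."""

    def __init__(self, node_names):
        ids = {key_of(hashlib.sha1(n.encode()).digest()) for n in node_names}
        self.node_ids = sorted(ids)
        # Each node keeps only the index entries (fingerprint -> block)
        # whose keys it is responsible for, so no central index exists.
        self.stores = {nid: {} for nid in self.node_ids}

    def _store_for(self, fingerprint: bytes) -> dict:
        return self.stores[successor(self.node_ids, key_of(fingerprint))]

    def put_file(self, data: bytes):
        """Store a file; return its block recipe and the bytes newly stored."""
        recipe, new_bytes = [], 0
        for chunk in split_chunks(data):
            fp = hashlib.sha1(chunk).digest()
            store = self._store_for(fp)
            if fp not in store:       # duplicate filter: skip known blocks
                store[fp] = chunk
                new_bytes += len(chunk)
            recipe.append(fp)
        return recipe, new_bytes

    def get_file(self, recipe) -> bytes:
        """Rebuild a file from its recipe by looking up each block's node."""
        return b"".join(self._store_for(fp)[fp] for fp in recipe)
```

Calling `put_file` twice with the same bytes stores nothing new on the second call, which is the storage saving the abstract refers to. Because cut points depend only on the bytes inside the sliding window, a small edit to a file tends to change only the blocks near the edit, so a backup over a slow link would mostly need to send those blocks.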