Research Of Global Data De-duplication Technique In Backup System

Posted on: 2014-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: R Liu
Full Text: PDF
GTID: 2268330422963515
Subject: Computer system architecture
Abstract/Summary:
With the explosive growth of digital information, backup systems hold more and more duplicate data, which wastes storage resources and network bandwidth and raises the cost of processing duplicate data. Data de-duplication technology has been widely used in backup systems to reduce storage space and save network bandwidth. As a relatively new technology, data de-duplication still faces many unsolved problems. A backup system based on per-user local de-duplication can delete duplicate data only within a single user's data and cannot delete duplicate data between users, which wastes a large amount of storage space and network bandwidth. In addition, fingerprint retrieval in such a system is inefficient because it uses a database to store fingerprints. To resolve these problems caused by local de-duplication, this thesis presents a global de-duplication technique that deletes duplicate data between users in order to achieve a higher de-duplication rate and use less storage space and network bandwidth. To improve the efficiency of fingerprint retrieval, a two-tier fingerprint index structure based on the minimum hash and Jaccard similarity is proposed to replace the database scheme. It requires only one disk access per file for fingerprint lookup, which greatly improves fingerprint retrieval efficiency. To ensure that global data can be shared between users, a multi-user concurrency mechanism is designed and implemented to guarantee data consistency and correctness. Test results show that the global data de-duplication technique can remove duplicate data between users and greatly save storage space. The two-tier index spends less time on fingerprint retrieval than the local de-duplication scheme. The multi-user concurrency mechanism ensures stable operation with multiple users.
Keywords/Search Tags: Storage server, Global de-duplication, Two-tier index, Multi-user concurrency
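
The abstract does not give the layout of the two-tier index, so the following is only a minimal sketch of how a minimum-hash index with one second-tier access per file could work. It is not the thesis's implementation. The names `chunk_fingerprints`, `jaccard`, `TwoTierIndex`, `store_file`, and `bin_reads`, the fixed 4096-byte chunking, and the use of SHA-1 are all assumptions made for illustration. The idea relies on a known property of min-hashing: under a random hash function, two fingerprint sets share the same minimum with probability equal to their Jaccard similarity, so similar files tend to land in the same second-tier bin.

```python
import hashlib


def chunk_fingerprints(data, chunk_size=4096):
    """Split data into fixed-size chunks and return their SHA-1 fingerprints.

    Fixed-size chunking is an assumption of this sketch.
    """
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two fingerprint sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class TwoTierIndex:
    """Hypothetical two-tier fingerprint index.

    Tier 1 (kept in memory): representative fingerprint -> bin id.
    Tier 2 (a dict standing in for on-disk bins): bin id -> fingerprints.
    """

    def __init__(self):
        self.primary = {}
        self.bins = {}
        self.bin_reads = 0  # counts simulated disk accesses

    def store_file(self, data):
        """Index one file and return fingerprints of chunks not seen in its bin."""
        fps = chunk_fingerprints(data)
        if not fps:
            return []
        rep = min(fps)  # minimum hash acts as the file's representative
        if rep in self.primary:
            # A similar file was seen before: load its bin with a single read.
            known = self.bins[self.primary[rep]]
            self.bin_reads += 1
        else:
            # No similar file known: create a new, empty bin.
            bin_id = len(self.bins)
            self.primary[rep] = bin_id
            known = self.bins[bin_id] = set()
        new = []
        for fp in fps:
            if fp not in known:
                known.add(fp)
                new.append(fp)
        return new


if __name__ == "__main__":
    index = TwoTierIndex()
    user_a_file = b"".join(bytes([i]) * 4096 for i in range(8))
    print(len(index.store_file(user_a_file)))  # chunks stored for the first user
    print(len(index.store_file(user_a_file)))  # same file from a second user
    print(index.bin_reads)
```

In this sketch each file triggers at most one bin read, at the cost of missing duplicates that happen to live in a different bin. Whether the thesis accepts the same trade-off is not stated in the abstract.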