
Research On Key Technology In Data Backup Based On Deduplication

Posted on: 2014-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: C S Deng
Full Text: PDF
GTID: 2268330401966241
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the information society, data is growing explosively, and the volume of data held in data centers worldwide is huge. According to statistics, as much as 60% of this data is duplicate data. Because this data must be continually stored and transmitted, the duplication wastes a great deal of storage space and network bandwidth and increases the cost of data storage and management. How to use deduplication technology to eliminate redundant storage of massive data has therefore become one of the important problems the storage industry currently needs to solve.

Building on deduplication technology, this thesis studies similarity-based data backup and recovery. The main work and innovations are as follows:

(1) The thesis proposes EDBRA, a backup and recovery approach based on the linear Delta chain. When recovering data, EDBRA avoids recovering the intermediate version files; instead, it computes the Delta file and then recovers the required version file by running the Delta decompression algorithm. EDBRA thus solves the low file-recovery efficiency of backup systems based on the linear Delta chain. On this basis, the thesis designs the EDBRA backup system based on the linear Delta chain. The system keeps the same optimal backup performance as a backup system based on the linear Delta chain while offering better data recovery performance than the traditional recovery method.

(2) The thesis further improves EDBRA and proposes a new data backup and recovery algorithm, BD_EDBRA, based on the bidirectional Delta chain. When recovering data, BD_EDBRA calculates the threshold of data recovery and selects an optimal recovery strategy according to that threshold. With this strategy, the time cost of data recovery is significantly lower than with EDBRA, while data backup performance changes only slightly compared with EDBRA. On this basis, the thesis designs the BD_EDBRA backup system based on the bidirectional Delta chain. In this system, data recovery performance improves significantly, and data backup performance is similar to that of the EDBRA backup system based on the linear Delta chain.

(3) The thesis designs and implements both the EDBRA backup system based on the linear Delta chain and the BD_EDBRA backup system based on the bidirectional Delta chain, and tests them on a large amount of test data. The experimental data show that the former's data recovery performance is superior to the traditional method while its data backup performance remains the same. The latter's data recovery performance is clearly better than the former's, while its data backup performance changes only slightly.
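The idea in point (1), recovering a version without rebuilding every intermediate file, can be illustrated with a small sketch. The abstract does not specify the Delta format or the EDBRA procedure, so everything here is an assumption for illustration: a toy delta made of `("copy", offset, length)` and `("add", data)` operations, and hypothetical helper names (`apply_delta`, `slice_delta`, `compose`, `recover_naive`, `recover_composed`). It shows one plausible way to merge the deltas of a linear chain into a single delta against the base and then decompress once; it is not the thesis's algorithm.

```python
from typing import List, Tuple, Union

# Toy delta format (an assumption, not taken from the thesis):
#   ("copy", offset, length) copies bytes from the previous version
#   ("add", data)            inserts literal bytes
Op = Union[Tuple[str, int, int], Tuple[str, bytes]]
Delta = List[Op]


def op_len(op: Op) -> int:
    """Number of output bytes an operation produces."""
    return op[2] if op[0] == "copy" else len(op[1])


def apply_delta(base: bytes, delta: Delta) -> bytes:
    """Delta decompression: rebuild a version from its predecessor."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += base[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


def slice_delta(delta: Delta, start: int, length: int) -> Delta:
    """Operations that produce bytes [start, start+length) of delta's output."""
    result: Delta = []
    pos, end = 0, start + length
    for op in delta:
        n = op_len(op)
        lo, hi = max(start, pos), min(end, pos + n)
        if lo < hi:
            skip = lo - pos
            if op[0] == "copy":
                result.append(("copy", op[1] + skip, hi - lo))
            else:
                result.append(("add", op[1][skip:skip + (hi - lo)]))
        pos += n
        if pos >= end:
            break
    return result


def compose(first: Delta, second: Delta) -> Delta:
    """Single delta equivalent to applying `first`, then `second`."""
    out: Delta = []
    for op in second:
        if op[0] == "copy":
            # A copy from the intermediate version is rewritten in terms
            # of the operations that would have produced those bytes.
            out.extend(slice_delta(first, op[1], op[2]))
        else:
            out.append(op)
    return out


def recover_naive(base: bytes, chain: List[Delta], k: int) -> bytes:
    """Traditional recovery: materialize versions 1..k one after another."""
    data = base
    for delta in chain[:k]:
        data = apply_delta(data, delta)
    return data


def recover_composed(base: bytes, chain: List[Delta], k: int) -> bytes:
    """Merge the first k deltas, then run delta decompression once."""
    if k == 0:
        return base
    combined = chain[0]
    for delta in chain[1:k]:
        combined = compose(combined, delta)
    return apply_delta(base, combined)
```

For example, with `base = b"hello world"` and the chain `[[("copy", 0, 6), ("add", b"there")], [("add", b">> "), ("copy", 3, 8)]]`, both `recover_naive(base, chain, 2)` and `recover_composed(base, chain, 2)` return `b">> lo there"`, but the composed path never builds the intermediate version `b"hello there"`.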
Keywords/Search Tags: deduplication, data backup, data recovery, threshold of data recovery