The Study And Improvement Of Deduplication Of Files In Cloud Storage Based On Bloom Filter

Posted on:2017-03-14

Degree:Master

Type:Thesis

Country:China

Candidate:F N Lin

Full Text:PDF

GTID:2308330503468525

Subject:Computer Science and Technology

Abstract/Summary:

With the prevalence of cloud storage and growing public familiarity with it, more and more users upload file data to cloud storage in order to store files, share them with other devices or users, or back them up periodically. Without deduplication, this leads to a large number of identical files in cloud storage. Deduplicating file data reduces the storage space needed to hold the data, and also reduces the transmission bandwidth consumed when file data is backed up to guard against failures. Deduplication thus brings economic benefits to cloud storage providers and plays an important role in cloud storage.

In a backup system, because of the locality of the data and the small amount of modification between backups, files often reappear in the same or very similar sequences. Unlike a backup system, however, the main data source of cloud storage is file data from personal computers. This data source is random in nature: it is impossible to know which file will be uploaded to cloud storage next.

Based on these features of the cloud storage data source, a method of file deduplication in cloud storage based on Bloom Filter is proposed. During file chunking, each file type is chunked in the way that is most effective for the characteristics of that type. During chunk indexing, the index is based on file similarity theory, and a Bloom Filter is added to speed up chunk lookup. Because a false positive in the Bloom Filter has a different cost under different file chunking methods, a differentiated Bloom Filter is used to minimize the total cost. The method builds a model combining a hash table, a differentiated Bloom Filter, and an index of similar files.

In the experiments, the proposed method is compared with methods based on a commonly implemented non-differentiated Bloom Filter, and also with AA-Dedupe and Extreme Binning, which are implemented in a similar way. The results show that the proposed method improves time consumption and memory consumption with little loss in deduplication rate.
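
As a rough illustration of the idea of a differentiated Bloom Filter placed in front of a chunk index, the following Python sketch keeps one filter per chunking method, each sized for its own false-positive target. It is not the thesis implementation: the class names, the method labels ("whole_file", "content_defined"), the chosen rates, and the in-memory set standing in for the chunk index are all assumptions made for illustration, since the abstract does not give these details.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter sized from an expected item count and a target false-positive rate."""

    def __init__(self, expected_items: int, fp_rate: float):
        # Standard sizing formulas: m = -n ln p / (ln 2)^2 and k = (m / n) ln 2.
        self.num_bits = max(8, math.ceil(-expected_items * math.log(fp_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Double hashing: derive all bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


class DifferentiatedBloomIndex:
    """One filter per chunking method, each with its own false-positive target (hypothetical layout)."""

    def __init__(self, fp_rates: dict, expected_items: int):
        self.filters = {m: BloomFilter(expected_items, p) for m, p in fp_rates.items()}
        # Assumption: a plain set stands in for the real, slower chunk index.
        self.index = {m: set() for m in fp_rates}

    def is_duplicate(self, method: str, fingerprint: bytes) -> bool:
        # A negative answer from the filter is definitive, so the index lookup is skipped.
        if fingerprint not in self.filters[method]:
            return False
        # A positive answer may be a false positive, so it is confirmed against the index.
        return fingerprint in self.index[method]

    def store(self, method: str, fingerprint: bytes) -> None:
        self.filters[method].add(fingerprint)
        self.index[method].add(fingerprint)


# Assumed rates: a stricter target where a false positive is taken to be more expensive.
dedup = DifferentiatedBloomIndex({"whole_file": 0.01, "content_defined": 0.001}, expected_items=1000)
fingerprint = hashlib.sha256(b"example chunk").digest()
seen_before = dedup.is_duplicate("content_defined", fingerprint)
dedup.store("content_defined", fingerprint)
seen_after = dedup.is_duplicate("content_defined", fingerprint)
```

In this sketch, a method with a lower false-positive target gets a larger filter, which trades memory for fewer wasted index lookups. How the thesis actually assigns rates so that the total cost is minimized is not stated in the abstract.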

Keywords/Search Tags:

cloud storage, file deduplication, differentiated Bloom Filter

Related items

1	Research On Similarity-based Secure Data Deduplication In Cloud Computing
2	Research On Security Deduplication Technology Of Cloud Storage Encrypted Data
3	Research On A File-level Data Reduplication Approach In Cloud Storage Systems
4	Research In Data-deduplication Based On Storage System
5	Research And Application Of Data Deduplication Technology Based On Bloom Filter
6	Research On Multi Cloud Dynamic Security Storage Technology
7	The Design And Implementation Of Data Deduplication With Garbage Data Removal Policy
8	Deduplication Research In Cloud Storage Environment
9	Research And Implementation Of Ciphertext Deduplication In Cloud Storage
10	Research On The Method Of Cloud Storage Deduplication For Encrypted Files