A Lightweight Delta Synchronization Approach For Cloud Storage Services Inspired By Data Deduplication

Posted on: 2022-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y He
Full Text: PDF
GTID: 2518306569997419
Subject: Computer technology
Abstract/Summary:
In recent years, the amount of global data has exploded, and the size of the files synchronized by cloud storage services has grown with it. Delta synchronization, which transmits only the differences between file versions, effectively reduces network traffic and eases the growing pressure of synchronizing large files. However, with the spread of devices with limited computing power, such as mobile phones and IoT devices, rsync, the classic delta synchronization method, increasingly fails to meet the needs of lightweight synchronization scenarios. The main reason is that the rsync protocol must compute a weak fingerprint byte by byte to handle the "block offset" (or "block drift") phenomenon caused by fixed-length blocks, a time-consuming operation that places a heavy burden on devices with limited computing resources. Inspired by data deduplication, this thesis replaces the fixed-size chunking (FSC) used in delta synchronization with the content-defined chunking (CDC) used in data deduplication. This avoids block drift and thereby eliminates the byte-by-byte calculation of weak fingerprints. To offset the extra time overhead introduced by CDC, this thesis reuses the hash value produced during CDC to generate a weak fingerprint with a low collision rate, which replaces the weak fingerprint of the original delta synchronization protocol. The result is CDCsync, a delta synchronization method that requires fewer computing resources and is suitable for lightweight scenarios. To further reduce computational and metadata overhead, this thesis proposes two strategies based on the properties of strong and weak fingerprints and on the locality of file edits: separating the comparison of strong and weak fingerprints, and merging consecutive matching blocks. Together these form the final version, Dsync. Our evaluation on both benchmark and real-world datasets suggests that Dsync runs 2x-8x faster and supports 30%-50% more clients than the state-of-the-art rsync-based WebR2sync+ and deduplication-based approaches.
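
The ideas summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the Gear-style rolling hash, the parameters `MASK`, `MIN_SIZE` and `MAX_SIZE`, the use of SHA-1 as the strong fingerprint, and all function names are assumptions chosen for illustration. The sketch cuts chunks where the rolling hash meets a condition, reuses the hash value at the cut point as the weak fingerprint, computes the strong fingerprint only after a weak hit (one plausible reading of "separating strong and weak fingerprint comparison"), and merges consecutive matching chunks into a single copy instruction.

```python
import hashlib
import random

# Hypothetical parameters, chosen only for illustration.
MASK = (1 << 6) - 1   # cut when the low 6 bits of the hash are zero
MIN_SIZE = 16         # minimum chunk length in bytes
MAX_SIZE = 256        # forced cut at this chunk length

_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # fixed per-byte random table


def cdc_chunks(data: bytes):
    """Content-defined chunking; returns a list of (chunk, weak_fingerprint)."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF  # Gear-style rolling hash
        length = i - start + 1
        if length < MIN_SIZE:
            continue
        if (h & MASK) == 0 or length >= MAX_SIZE:
            # Reuse the hash at the cut point as the weak fingerprint.
            chunks.append((data[start:i + 1], h))
            start, h = i + 1, 0
    if start < len(data):
        chunks.append((data[start:], h))  # trailing chunk
    return chunks


def build_index(old: bytes):
    """Map weak fingerprint -> [(strong fingerprint, offset, length)] for old."""
    index, offset = {}, 0
    for chunk, weak in cdc_chunks(old):
        strong = hashlib.sha1(chunk).digest()
        index.setdefault(weak, []).append((strong, offset, len(chunk)))
        offset += len(chunk)
    return index


def make_delta(new: bytes, index):
    """Return ("copy", offset, length) and ("data", bytes) instructions."""
    ops = []
    for chunk, weak in cdc_chunks(new):
        match = None
        candidates = index.get(weak)
        if candidates:  # strong hash is computed only after a weak hit
            strong = hashlib.sha1(chunk).digest()
            for s, off, length in candidates:
                if s == strong:
                    match = (off, length)
                    break
        if match is None:
            if ops and ops[-1][0] == "data":
                ops[-1] = ("data", ops[-1][1] + chunk)
            else:
                ops.append(("data", chunk))
        elif ops and ops[-1][0] == "copy" and ops[-1][1] + ops[-1][2] == match[0]:
            # Merge consecutive matching chunks into one copy instruction.
            ops[-1] = ("copy", ops[-1][1], ops[-1][2] + match[1])
        else:
            ops.append(("copy", match[0], match[1]))
    return ops


def apply_delta(old: bytes, ops):
    """Rebuild the new file from the old file and the delta instructions."""
    out = bytearray()
    for op in ops:
        out += old[op[1]:op[1] + op[2]] if op[0] == "copy" else op[1]
    return bytes(out)
```

In a real deployment the index and the delta would travel between client and server; here both sides live in one process so the round trip `apply_delta(old, make_delta(new, build_index(old)))` can be checked directly. Because cut points depend on content rather than position, an insertion near the start of a file tends to disturb only the chunks around the edit, which is why no byte-by-byte sliding search is needed.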
Keywords/Search Tags: lightweight delta synchronization, content-defined chunking, cloud storage