Research And Implementation Of File Incremental Synchronization Based On Data Block

Posted on:2021-08-08

Degree:Master

Type:Thesis

Country:China

Candidate:K Yu

Full Text:PDF

GTID:2518306464983649

Subject:Computer Science and Technology

Abstract/Summary:

With the advent of the big data era, organizations inevitably continue to accrue large amounts of data, not only from traditional product-focused sources but also from digital outlets such as mobile devices, social media networks, and the Internet of Things. Information safety has therefore become increasingly important for enterprises and industries, and ensuring adequate disaster recovery backup in the event of an accident has become an increasingly important research direction. Because of the volume and velocity of big data, source data must be synchronized to the backup server quickly and effectively. Traditional synchronization methods occupy large amounts of storage space, consume high network bandwidth, and synchronize inefficiently when dealing with big data. This thesis proposes identifying incremental changes with a data block algorithm and a Bloom filter, and designs an incremental synchronous backup tool on that basis.

The thesis first introduces the current state of research on data synchronization backup to clarify the needs and goals, and analyzes related technologies including data block algorithms, Bloom filters, and the Inotify mechanism. The discussion of block algorithms compares fixed-length and variable-length chunking, focuses on the Rsync and RAM algorithms, and analyzes their characteristics and shortcomings. It then introduces the standard Bloom filter and several improved Bloom filters based on it.

Secondly, the thesis proposes an improved RAMM algorithm and a non-partitioned single-hash Bloom filter (NPSHBF). These address, respectively, the overly long blocks produced by the RAM algorithm and the standard Bloom filter's need for multiple hash functions with demanding requirements. The rationality and effectiveness of the improved algorithms are verified both experimentally and analytically.

Thirdly, an incremental synchronization backup tool is designed and implemented in a hierarchical and modular manner. It consists of four main modules: a network transmission module, a data monitoring module, a data synchronization module, and a control module. The monitoring module uses the Inotify mechanism to monitor files for real-time synchronization, while the synchronization module uses the RAMM block algorithm and the non-partitioned single-hash Bloom filter to identify and synchronize increments.

Finally, a series of tests was conducted on the incremental synchronization backup tool. The results show that, compared with full synchronization, the improved RAMM block algorithm and the non-partitioned single-hash Bloom filter complete synchronization backup efficiently and reduce network bandwidth and memory consumption. The tool also performs well when applied to the Ceph distributed file storage system built on the OpenStack cloud computing platform.
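The abstract does not specify RAMM or NPSHBF, so the following is only a rough sketch of the general idea of block-based incremental identification, not the thesis's method. It assumes a simplified content-defined chunker in the style usually attributed to RAM (a fixed window, then a cut at the first byte not smaller than the window maximum) and a standard Bloom filter whose probe positions are derived from a single SHA-256 digest. The names `chunk_boundaries`, `BloomFilter`, and `changed_chunks`, the parameters `window`, `max_size`, `num_bits`, and `num_probes`, and the `max_size` cap itself are all hypothetical choices made for illustration.

```python
import hashlib


def chunk_boundaries(data: bytes, window: int = 64, max_size: int = 4096):
    """Yield variable-length chunks of `data` (assumed RAM-style rule).

    Each chunk starts with a fixed window of `window` bytes. The chunk ends
    at the first later byte that is >= the largest byte in that window, or
    after `max_size` bytes (an assumption of this sketch), whichever is first.
    """
    start, n = 0, len(data)
    while start < n:
        window_end = min(start + window, n)
        threshold = max(data[start:window_end])
        limit = min(start + max_size, n)
        cut = limit
        for i in range(window_end, limit):
            if data[i] >= threshold:
                cut = i + 1  # the cut byte belongs to the current chunk
                break
        yield data[start:cut]
        start = cut


class BloomFilter:
    """Standard Bloom filter; probe positions come from one SHA-256 digest."""

    def __init__(self, num_bits: int = 1 << 16, num_probes: int = 4):
        self.num_bits = num_bits
        self.num_probes = num_probes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step for double hashing
        for i in range(self.num_probes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def changed_chunks(old: bytes, new: bytes):
    """Return chunks of `new` whose fingerprints are absent from `old`'s filter."""
    seen = BloomFilter()
    for chunk in chunk_boundaries(old):
        seen.add(hashlib.sha256(chunk).digest())
    return [c for c in chunk_boundaries(new)
            if hashlib.sha256(c).digest() not in seen]
```

Because a Bloom filter can report false positives, a changed chunk may occasionally be mistaken for one already present. A real tool would need an additional verification step before skipping a chunk; this sketch omits it.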

Keywords/Search Tags:

Data Chunking, NPSHBF, RAMM, Incremental Synchronization, Data Backup

Related items

1	Data Backup System Based On Dynamic Length Blocking Incremental Backup Algorithm
2	Research And Implementation Of OAA Data Engine Based On The Model Of Aggregation On Supply And Demand
3	Design And Implementation Of Private Data Backup And Recovery Applications Based On Java
4	Research And Implementation Of Data Backup Technology Based On Incremental Mode
5	Research On Distributed Data Full Backup And Incremental Backup Of File System
6	Design And Implementation Of Data Backup And Recovery System Of Sichuan Provincial Electric Power Company Technical Skills Training Center
7	The Design And Implementation Of File Matching Method In Data Backup & Recovery System
8	Research And Implementation Of Mobile Terminal Data Backup
9	New Generation Of Net-Disk Research On Supporting Automatic Data Backup And Sharing
10	Based On Linux Data Backup And Recovery System Design And Implementation