
The Research And Implementation Of Massive Data Synchronization System For Database Based On Log Parser

Posted on: 2017-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: F L Song
GTID: 2348330536953084
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of e-commerce and the financial industry, data storage and processing have entered the era of massive data. Meanwhile, databases have evolved from centralized to distributed architectures, with data scattered across different servers. This distribution keeps enterprise systems highly reliable, but it also raises a key issue: how to maintain data consistency across the distributed nodes. Against this background, this thesis presents a synchronization solution based on a log parser, and designs and implements a prototype data synchronization system.

First, the thesis investigates the impact of abnormal data in various industries, explains the necessity and urgency of building data synchronization, and briefly analyzes the current state and application of the technology in China and abroad. To address several key problems of data synchronization, it summarizes the current mainstream data synchronization and incremental data capture methods and compares their advantages and disadvantages. From the feasible techniques, it selects a log parser to reconstruct SQL statements and a file system filter driver to capture incremental data from the log files in real time.

Second, a large number of experiments are carried out to extract SQL operation information from the binary log file. From the experimental data we obtain the detailed internal structure of the log file and identify its logical structure: the outermost layer is the redo block, the middle layer is the redo record, and the innermost layer is the redo change vector. The content of each layer can be extracted, and each atomic operation can then be mapped to its corresponding change vector structure. From that structure we can identify the stored field information and the physical data block address of the corresponding rows, and thus reconstruct the SQL statement.

On this basis, we design a prototype data synchronization system. We present the overall framework of the system and the logical structure of each subsystem in diagrams, and describe in detail the architecture design, the processing flow of each subsystem, the data storage, and the key technologies. The system consists of four main parts: the log monitoring subsystem, the log parsing subsystem, the data transmission subsystem, and the data writing subsystem. In the implementation and testing section, we describe the functions of each subsystem and functional module and present the key source code. Functional, performance, and compatibility tests are carried out on the system, and they verify its feasibility, reliability, and scalability. Finally, the thesis summarizes the work and the remaining problems, and proposes suggestions for improving the system.
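The three-layer log structure described above (redo block, redo record, change vector) and the step of rebuilding SQL from a change vector can be illustrated with a minimal Python sketch. It is not the thesis implementation: the class fields, the `quote`, `to_sql`, and `statements` helpers, and the use of already-decoded values instead of the binary log format are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class ChangeVector:
    """Innermost layer: one atomic operation (hypothetical, simplified fields)."""
    op: str                     # "INSERT", "UPDATE" or "DELETE"
    table: str
    block_addr: int             # physical data block address of the affected row
    columns: Dict[str, object] = field(default_factory=dict)  # new column values
    key: Dict[str, object] = field(default_factory=dict)      # row identification


@dataclass
class RedoRecord:
    """Middle layer: a group of change vectors."""
    vectors: List[ChangeVector]


@dataclass
class RedoBlock:
    """Outermost layer: a group of redo records."""
    records: List[RedoRecord]


def quote(value: object) -> str:
    """Render a value as a SQL literal: NULL, a bare number, or a quoted string."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql(cv: ChangeVector) -> str:
    """Rebuild one SQL statement from one change vector (assumed mapping)."""
    where = " AND ".join(f"{k} = {quote(v)}" for k, v in cv.key.items())
    if cv.op == "INSERT":
        cols = ", ".join(cv.columns)
        vals = ", ".join(quote(v) for v in cv.columns.values())
        return f"INSERT INTO {cv.table} ({cols}) VALUES ({vals})"
    if cv.op == "UPDATE":
        sets = ", ".join(f"{k} = {quote(v)}" for k, v in cv.columns.items())
        return f"UPDATE {cv.table} SET {sets} WHERE {where}"
    if cv.op == "DELETE":
        return f"DELETE FROM {cv.table} WHERE {where}"
    raise ValueError(f"unsupported operation: {cv.op}")


def statements(blocks: List[RedoBlock]) -> Iterator[str]:
    """Walk block -> record -> change vector and yield SQL in log order."""
    for block in blocks:
        for record in block.records:
            for cv in record.vectors:
                yield to_sql(cv)
```

In a pipeline shaped like the four subsystems named in the abstract, a function such as `statements` would sit in the log parsing stage, between log monitoring (which supplies newly written log data) and data transmission (which ships the rebuilt statements to the target node).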
Keywords/Search Tags:database, data synchronization, redo log, file system filter driver