Design And Implementation Of Incremental Data Extractor Based On Sector Inquiry

Posted on: 2016-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Wen
Full Text: PDF
GTID: 2308330482964370
Subject: Computer technology
Abstract/Summary:
The rapid development of the information era brings both opportunities and challenges. The information explosion has strongly affected data integration, data updating, and data maintenance. In particular, keeping data from many different sources synchronized with a target such as a data warehouse is currently a major problem. ETL (Extraction, Transformation, Loading) is an important part of building a data warehouse. It is responsible for moving data distributed across heterogeneous sources, such as relational data and data files, into a temporary middle layer, processing it there, and finally loading it into the data warehouse. This thesis mainly studies incremental data extraction that works uniformly across different databases, with particular attention to the incremental extraction methods in use today. By examining the mechanisms of several common methods, namely the trigger, timestamp, comparison, log, and CDC (Change Data Capture) methods, we summarize the characteristics of each and analyze their advantages and disadvantages. Building on an analysis of how the timestamp method is implemented, we design a solution: a field that already occurs frequently in traditional databases is used as the timestamp attribute, and a continuous query is run against the specified database and data table to extract incremental data from the source. This overcomes the timestamp method's drawback of altering the structure of the source tables. Finally, this thesis uses Java Swing, multithreading, and database technology to implement an incremental data extraction tool. The tool's performance is compared with SQL Server on indicators such as accuracy and query time to verify the feasibility and efficiency of the scheme.
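The idea of reusing an existing field as the timestamp attribute and querying the table repeatedly can be illustrated with a minimal Java sketch. It is not the thesis's implementation: the names `Row`, `SectionExtractor`, `seq`, and `sectionSize` are hypothetical, the source table is simulated with an in-memory list rather than a JDBC connection, and the sketch assumes that the chosen field is unique and only ever increases.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical source row. 'seq' stands in for an existing, ever-increasing
// column (for example an auto-increment key) reused as the timestamp attribute.
final class Row {
    final long seq;
    final String payload;

    Row(long seq, String payload) {
        this.seq = seq;
        this.payload = payload;
    }
}

// Sketch of watermark-based incremental extraction, read one section at a time.
// Assumption: 'seq' values are unique, so no row shares the watermark value.
final class SectionExtractor {
    private final int sectionSize;
    private long watermark; // largest 'seq' value extracted so far

    SectionExtractor(long initialWatermark, int sectionSize) {
        if (sectionSize <= 0) {
            throw new IllegalArgumentException("sectionSize must be positive");
        }
        this.watermark = initialWatermark;
        this.sectionSize = sectionSize;
    }

    // In-memory stand-in for a query of the form:
    //   SELECT ... FROM source_table WHERE seq > ? ORDER BY seq
    // limited to the first 'sectionSize' rows.
    private List<Row> querySection(List<Row> sourceTable) {
        return sourceTable.stream()
                .filter(r -> r.seq > watermark)
                .sorted(Comparator.comparingLong((Row r) -> r.seq))
                .limit(sectionSize)
                .collect(Collectors.toList());
    }

    // Reads sections until none remain, advancing the watermark after each one.
    // The source table itself is never modified.
    List<Row> extractNew(List<Row> sourceTable) {
        List<Row> extracted = new ArrayList<>();
        List<Row> section = querySection(sourceTable);
        while (!section.isEmpty()) {
            extracted.addAll(section);
            watermark = section.get(section.size() - 1).seq;
            section = querySection(sourceTable);
        }
        return extracted;
    }

    long watermark() {
        return watermark;
    }
}
```

Calling `extractNew` a second time on the same table returns only rows whose `seq` exceeds the stored watermark, which is what makes the extraction incremental. Because the sketch only reads an existing column, no extra timestamp column has to be added to the source table. A polling loop on a background thread could call `extractNew` at intervals to approximate the continuous query described above.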
Keywords/Search Tags: Incremental data extraction, ETL, Section query, Multithreading, Timestamp method