Design, Implementation and Application of Process-Driven ETL Tools

Posted on: 2014-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Zhang
GTID: 2268330401971989
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the information society, departments have accumulated a wide range of historical data, stored according to their differing business needs and to the hardware and software architectures in use at different points in time. This data often suffers from duplication, sensitive or inconsistent values, and other data quality problems, and it is difficult to keep data consistent across departments so that it forms a complete data structure. The data therefore needs to be integrated and compared so that it can be shared and reused across departments.

ETL stands for Extraction, Transformation, and Loading. It is the cornerstone of inter-departmental data sharing: its value lies in making inter-departmental data standardized, centralized, and complete. ETL is primarily responsible for extracting data from distributed, heterogeneous sources into a temporary database, where it is cleaned, converted, and integrated, and for finally loading it into the target repository, which is the foundation for data sharing.

Firstly, by introducing the concept of workflow, the different operations in ETL are abstracted into corresponding nodes, and a node model is given. The nodes are connected to form an ETL process, which makes the entire ETL job procedural and makes the tool more flexible and general than traditional methods. Secondly, drawing on the idea of metadata, this paper completes the overall design and the module implementation of the process-driven tool. Finally, this paper describes an application built on the ETL tool: the Social Credit Joint Credit Information System.

The main innovations of this paper are as follows:

1. The ETL process used for data sharing is combined with workflow. The tool abstracts the various operations into corresponding nodes, and these nodes are arranged in a certain order, according to specific business needs, to configure an ETL process. The ETL job is completed by executing the entire process. During configuration, the nodes can be arranged in any order, which greatly improves the flexibility and versatility of data cleansing and data matching.

2. During execution, the ETL process is parsed into a SQL statement that is then executed, so the data is processed in a non-procedural way. This can improve the execution efficiency of ETL.
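The two ideas above (operations as configurable nodes, and a process parsed into a SQL statement) can be illustrated with a minimal sketch. It is not the thesis's implementation: the node classes `Extract`, `Filter`, `Deduplicate`, and `Load`, the function `compile_process`, and the tables `src` and `target` are all hypothetical names chosen for illustration, and SQLite stands in for whatever database the actual tool targets.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical node types; the thesis's own node model is not shown in the abstract.
@dataclass
class Extract:
    table: str
    columns: list  # column names to read from the source table


@dataclass
class Filter:
    predicate: str  # a SQL boolean expression, e.g. "code IS NOT NULL"


@dataclass
class Deduplicate:
    pass  # drops exact duplicate rows


@dataclass
class Load:
    target: str  # name of the target table


def compile_process(nodes):
    """Fold an ordered node list into one INSERT ... SELECT statement.

    Identifiers are interpolated directly, so this sketch assumes the
    process configuration is trusted.
    """
    if not nodes or not isinstance(nodes[0], Extract) or not isinstance(nodes[-1], Load):
        raise ValueError("process must start with Extract and end with Load")
    first, last = nodes[0], nodes[-1]
    query = f"SELECT {', '.join(first.columns)} FROM {first.table}"
    # Each intermediate node wraps the query built so far as a subquery,
    # so nodes can be chained in whatever order the process configures.
    for node in nodes[1:-1]:
        if isinstance(node, Filter):
            query = f"SELECT * FROM ({query}) WHERE {node.predicate}"
        elif isinstance(node, Deduplicate):
            query = f"SELECT DISTINCT * FROM ({query})"
        else:
            raise ValueError(f"unsupported node: {node!r}")
    return f"INSERT INTO {last.target} {query}"


def demo():
    """Run a small process against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (name TEXT, code TEXT)")
    conn.execute("CREATE TABLE target (name TEXT, code TEXT)")
    conn.executemany(
        "INSERT INTO src VALUES (?, ?)",
        [("Acme", "A1"), ("Acme", "A1"), ("Beta", None), ("Gamma", "G3")],
    )
    process = [
        Extract("src", ["name", "code"]),
        Filter("code IS NOT NULL"),
        Deduplicate(),
        Load("target"),
    ]
    conn.execute(compile_process(process))
    rows = sorted(conn.execute("SELECT name, code FROM target").fetchall())
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

Because the whole process becomes a single set-based statement, the database engine does the row processing rather than application code looping over records, which is one plausible reading of the efficiency claim in the second innovation.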
Keywords/Search Tags:Data Sharing, ETL, Process-Driven, Data Cleaning, Data Comparing