A Component-Based General ETL Tool With Code-Generation

Posted on: 2008-05-14 | Degree: Master | Type: Thesis
Country: China | Candidate: G L Ke | Full Text: PDF
GTID: 2178360215495596 | Subject: Computer application technology
Abstract/Summary:
Data warehousing integrates large amounts of operational data from multiple distributed, heterogeneous data sources into analytical data, providing users with a uniform environment for accessing it. It allows an enterprise to extract much more valuable information and makes the information it needs much easier to find, which strengthens the enterprise's competitiveness. ETL, which is dedicated to integrating data from multiple data sources, is the backbone of data warehousing. Surveys indicate that ETL design and development consume 60 to 80 percent of an entire data warehousing (DW) project. However, current ETL tools have a number of shortcomings, such as poor generality, complexity of use, and high cost. Some enterprises, especially small and medium-sized ones, therefore tend to hand-code their own ETL modules, which inevitably prolongs their DW projects.

This thesis introduces an ETL tool based on components and code generation. It first defines a series of components based on the characteristics of the ETL process. Each component takes on one or more tasks, and links, which define the ETL workflow, can be added among the components. The executable code is then generated automatically, and finally the ETL job is run by executing that code.

The component-based mechanism gives the ETL tool good extensibility and generality and makes ETL jobs easier to construct; the automatic code-generation mechanism frees programmers from trivial coding work.
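
The approach summarized above (define components, link them into a workflow, generate code, execute it) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Component` and `Workflow` classes, the template placeholders `{src}` and `{out}`, the generated `run_job` function, and the four sample components are all hypothetical names chosen for illustration, and the sketch assumes each component has at most one upstream link.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One ETL step (hypothetical design, not the thesis's component model)."""
    name: str      # also used as the variable name in the generated code
    template: str  # one statement; {src} = upstream output, {out} = own output


@dataclass
class Workflow:
    components: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (upstream, downstream) pairs

    def add(self, name, template):
        self.components[name] = Component(name, template)
        return self

    def link(self, upstream, downstream):
        self.links.append((upstream, downstream))
        return self

    def generate(self):
        """Emit Python source for the job, one statement per component."""
        parents = {name: [] for name in self.components}
        for upstream, downstream in self.links:
            parents[downstream].append(upstream)
        # Order components so each one comes after everything it depends on.
        order, pending = [], dict(parents)
        while pending:
            ready = [n for n, ps in pending.items() if all(p in order for p in ps)]
            if not ready:
                raise ValueError("workflow links contain a cycle")
            for n in ready:
                order.append(n)
                del pending[n]
        lines = ["def run_job(source_rows, target):"]
        for name in order:
            # Assumption: only the first upstream link is wired in this sketch.
            src = parents[name][0] if parents[name] else "source_rows"
            lines.append("    " + self.components[name].template.format(src=src, out=name))
        lines.append("    return " + order[-1])
        return "\n".join(lines)


# Build a small job: extract -> clean -> upper -> load.
job = (
    Workflow()
    .add("extract", "{out} = list({src})")
    .add("clean", "{out} = [r for r in {src} if r['amount'] is not None]")
    .add("upper", "{out} = [dict(r, name=r['name'].upper()) for r in {src}]")
    .add("load", "target.extend({src}); {out} = len(target)")
    .link("extract", "clean")
    .link("clean", "upper")
    .link("upper", "load")
)

source_code = job.generate()

# Run the ETL job by executing the generated code.
namespace = {}
exec(source_code, namespace)
warehouse = []
loaded = namespace["run_job"](
    [{"name": "ann", "amount": 3}, {"name": "bob", "amount": None}], warehouse
)
```

In this sketch, adding a new kind of step means registering one more component with its own template, which is one way to read the extensibility claim; the job author only declares components and links, and the per-job glue code is produced by `generate()`.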
Keywords/Search Tags: Data Warehousing, ETL, Data Integration, Component-based, Code-generation