Research On Modeling ETL Process In Data Warehouse

Posted on: 2010-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: R Z Zhao
GTID: 2178360302459448
Subject: Computer application technology
Abstract/Summary:
ETL integrates data under uniform rules and carries out the conversion of data from the data sources to the target data warehouse. It is an effective way to solve the consistency and integration problems of heterogeneous data, and it is an important means for the data warehouse to obtain high-quality data. To help designers design and implement ETL processes well, to reduce the cost of their design and maintenance, and to improve the performance of ETL tools, experts in related fields have proposed modeling the ETL process. At present, research on modeling the ETL process is still in its infancy, and the techniques are not yet mature and have some shortcomings. In this paper, the issues in modeling the ETL process at the conceptual and logical levels are analyzed comprehensively, and on this basis new approaches for establishing an ETL conceptual model and designing an ETL system architecture are proposed. The main contents are as follows.

Firstly, the existing approaches for establishing an ETL conceptual model are analyzed, and it is found that they do not describe in depth the internal mapping structure of ETL or the process of building the corresponding transformation operations. Therefore, a new approach for modeling the ETL process based on a structure graph is proposed. It describes in detail the mapping from sources to targets, as well as the corresponding transformation operations. The definition of the structure graph and its notations for nodes and edges are presented. The method and steps for modeling the ETL process are then explained in detail through an example.

Secondly, the methods for designing ETL tools and ETL system architecture are studied; the limitations of the traditional ETL architecture and the importance of metadata to the ETL process are analyzed in detail. On this basis, the traditional ETL architecture is improved by combining it with the theory of metadata management and by capturing metadata information from the ETL conceptual model. As a result, a new metadata-driven ETL logical architecture is presented.

Finally, a system for generating metadata is designed to handle the extraction, production, storage, and management of ETL metadata. It can also support the description, management, and use of the ETL model. The system's constituent modules, its design flow, and its implementation technology are introduced, and the operating results of the system are analyzed and evaluated.
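As a rough illustration of the idea summarized above, the following Python sketch models a structure graph whose nodes are source and target attributes and whose edges carry a transformation operation, and then derives simple mapping metadata from it. This is only a minimal sketch under stated assumptions: the abstract does not give the thesis's actual graph definition, notation, or metadata format, so the class `StructureGraph`, its methods, the attribute names, and the `concat` operation are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StructureGraph:
    """Hypothetical structure graph: attribute nodes plus labeled mapping edges."""

    # Node name -> role, where the role is "source" or "target".
    nodes: Dict[str, str] = field(default_factory=dict)
    # Each edge is (source attribute, target attribute, transformation operation).
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, name: str, role: str) -> None:
        if role not in ("source", "target"):
            raise ValueError(f"unknown role: {role}")
        self.nodes[name] = role

    def add_mapping(self, source: str, target: str, operation: str) -> None:
        # Only allow edges that run from a known source to a known target.
        if self.nodes.get(source) != "source":
            raise ValueError(f"not a source node: {source}")
        if self.nodes.get(target) != "target":
            raise ValueError(f"not a target node: {target}")
        self.edges.append((source, target, operation))

    def to_metadata(self) -> Dict[str, List[Dict[str, str]]]:
        """Group the edges by target attribute as simple mapping metadata."""
        metadata: Dict[str, List[Dict[str, str]]] = {}
        for source, target, operation in self.edges:
            entry = {"source": source, "operation": operation}
            metadata.setdefault(target, []).append(entry)
        return metadata


def build_example() -> StructureGraph:
    """Build a made-up mapping: two source name fields feed one target field."""
    graph = StructureGraph()
    graph.add_node("crm.customer.first_name", "source")
    graph.add_node("crm.customer.last_name", "source")
    graph.add_node("dw.dim_customer.full_name", "target")
    graph.add_mapping("crm.customer.first_name", "dw.dim_customer.full_name", "concat")
    graph.add_mapping("crm.customer.last_name", "dw.dim_customer.full_name", "concat")
    return graph


if __name__ == "__main__":
    print(build_example().to_metadata())
```

Grouping the edges by target attribute is one possible way to turn a conceptual mapping into metadata that a metadata-driven ETL engine could read; the thesis itself may organize this information differently.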
Keywords/Search Tags: Data warehouse, Extract-Transform-Load, Metadata, DSA, Structure graph, Conceptual model