Design and Implementation of an SCA-Based ETL Architecture

Posted on: 2015-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: X H Yi
Full Text: PDF
GTID: 2268330425981905
Subject: Computer application technology
Abstract/Summary:
Information systems are essential to the modern enterprise and store valuable corporate assets. In a competitive market, companies have to move from traditional online transaction processing and office automation systems towards business intelligence systems built on online analytical processing and data mining, and data warehouses provide the data foundation for these business intelligence systems. According to statistics, 60% to 80% of the development cycle and a third of the project cost in data warehouse construction are spent on ETL, which makes ETL the bottleneck of data warehouse projects.
To address this bottleneck in the ETL development cycle, this thesis proposes an SCA-based ETL architecture. The main ETL problems to be solved are divided at a finer granularity, and fine-grained components are adopted to implement the ETL process, with an appropriate implementation approach selected according to the characteristics of each problem. SOA design ideas, together with SCA as the preferred way of realizing them, are then used to integrate the implementations of these fine-grained components in an SCA container. The framework divides the implementation of ETL into four coarse-grained components: metadata components, common data source components, data quality components and dimensional modelling components. The metadata components and common data source components are basic components, which are called by the data quality components and dimensional modelling components. Each coarse-grained component comprises several fine-grained components that implement specific functions.
Drawing on the development of a practical project, this thesis describes the problems that arise when the ETL process is implemented with a single ETL tool, such as changes to the data sources, system upgrades and more demanding customer requirements. It then demonstrates that the SCA-based ETL architecture can solve these problems with great flexibility. It also shows that the SCA-based ETL architecture can shorten the development cycle and has practical value. Finally, it analyzes the advantages and disadvantages of the SCA-based ETL architecture and its application scenarios.
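The layering of the four coarse-grained components can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which the abstract does not show. The `Container` class is only a toy stand-in for an SCA container, and every class name, method name and sample value below is a hypothetical choice made for illustration. The sketch assumes that the data quality and dimensional modelling components obtain the two basic components by name from the container.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[object]]


class Container:
    """Toy stand-in for an SCA container: wires components by name (hypothetical)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[["Container"], object]] = {}
        self._instances: Dict[str, object] = {}

    def register(self, name: str, factory: Callable[["Container"], object]) -> None:
        self._factories[name] = factory
        self._instances.clear()  # any change forces components to be re-wired

    def get(self, name: str):
        if name not in self._instances:
            self._instances[name] = self._factories[name](self)
        return self._instances[name]


class MetadataComponent:
    """Basic component: required columns and dimension keys per table."""

    def __init__(self, required: Dict[str, List[str]], dimension_key: Dict[str, str]) -> None:
        self.required = required
        self.dimension_key = dimension_key


class DataSourceComponent:
    """Basic component: uniform read access to a source (in-memory here)."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def read(self, table: str) -> List[Row]:
        return list(self._tables[table])


class DataQualityComponent:
    """Calls both basic components; drops rows missing a required value."""

    def __init__(self, metadata: MetadataComponent, source: DataSourceComponent) -> None:
        self.metadata, self.source = metadata, source

    def clean(self, table: str) -> List[Row]:
        required = self.metadata.required[table]
        return [row for row in self.source.read(table)
                if all(row.get(col) is not None for col in required)]


class DimensionalModellingComponent:
    """Calls both basic components; assigns surrogate keys in first-seen order."""

    def __init__(self, metadata: MetadataComponent, source: DataSourceComponent) -> None:
        self.metadata, self.source = metadata, source

    def build_dimension(self, table: str) -> Dict[object, int]:
        key_column = self.metadata.dimension_key[table]
        dimension: Dict[object, int] = {}
        for row in self.source.read(table):
            value = row.get(key_column)
            if value is not None and value not in dimension:
                dimension[value] = len(dimension) + 1
        return dimension


def build_container(tables: Dict[str, List[Row]],
                    required: Dict[str, List[str]],
                    dimension_key: Dict[str, str]) -> Container:
    """Register the four coarse-grained components and their wiring."""
    c = Container()
    c.register("metadata", lambda c: MetadataComponent(required, dimension_key))
    c.register("source", lambda c: DataSourceComponent(tables))
    c.register("quality", lambda c: DataQualityComponent(c.get("metadata"), c.get("source")))
    c.register("modelling",
               lambda c: DimensionalModellingComponent(c.get("metadata"), c.get("source")))
    return c


if __name__ == "__main__":
    rows = [{"id": 1, "city": "Hangzhou"}, {"id": 2, "city": None}]
    container = build_container({"customer": rows},
                                {"customer": ["id", "city"]},
                                {"customer": "city"})
    print(container.get("quality").clean("customer"))
    print(container.get("modelling").build_dimension("customer"))
```

In this sketch, re-registering `"source"` with a different `DataSourceComponent` changes the data source without touching the data quality or dimensional modelling code, which is the kind of flexibility the abstract attributes to the architecture.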
Keywords/Search Tags: ETL, data quality, service component architecture, Web service