
Design And Implementation Of ETL System On ODS In The Insurance Industry

Posted on: 2009-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: C G He
GTID: 2178360275970200
Subject: Computer technology
Abstract/Summary:
Based on a real project at an insurance company, this thesis reviews ETL-related research and techniques and studies the design and implementation of an ETL system. The resulting system has been put into production.
On the design architecture, the thesis first proposes, according to the needs of the project, a design model architecture based on the common data warehouse model. On that basis, it builds a job scheduling meta model through logical analysis.
On data extraction, the thesis proposes an extraction-transfer-staging-merge approach to solve the problem of extracting and merging data in a distributed, heterogeneous environment.
On ETL system performance, the thesis improves performance by applying pipelining and partitioning.
On conforming duplicate customer data, the thesis first proposes a sorting and equality-matching algorithm and shows that it performs well when matching keys exist. It then uses business rules to derive an algorithm for processing the duplicate data, which had not been demonstrated before.
On detecting erroneous data, the thesis proposes an approach in which business rule objects are used to express the detection of erroneous data, and shows that the approach is effective and efficient.
On data quality, the thesis uses a quantitative system to establish the data quality dimensions and their importance weights, and then evaluates the data quality of the system as a whole with a weighted average over those dimensions. Making data quality a part of the ETL design extends the ETL design model and improves its usability.
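The weighted-average evaluation of data quality can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name `weighted_quality_score`, the dimension names, the scores, and the weights are all hypothetical, and it assumes each dimension has already been scored on a 0 to 1 scale.

```python
from typing import Mapping


def weighted_quality_score(
    scores: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Return the weighted average of per-dimension quality scores.

    Both mappings are keyed by dimension name. Weights are importance
    weights and do not need to sum to 1; they are normalized here.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[d] * weights[d] for d in scores) / total_weight


# Hypothetical dimensions, scores, and weights, for illustration only.
example_scores = {"completeness": 0.9, "accuracy": 0.8, "consistency": 1.0}
example_weights = {"completeness": 2.0, "accuracy": 3.0, "consistency": 1.0}
overall = weighted_quality_score(example_scores, example_weights)
```

Normalizing by the sum of the weights keeps the overall score on the same scale as the per-dimension scores, so importance weights can be stated on any convenient scale.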
Keywords/Search Tags: ETL, Meta model, data quality, erroneous data detection, duplicate data conforming