Research And Implementation Of Data Cleansing Framework Based On Component

Posted on: 2009-10-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z Li
GTID: 2178360308479373
Subject: Computer software and theory
Abstract/Summary:
The emergence of large-scale, interdisciplinary data warehouses has increased both the volume of data in each warehouse and the complexity of its data models, and the design of cleansing processes has become correspondingly more complicated. These changes create new requirements for the development of data cleansing software, such as dynamic creation, frequent modification, and more user interaction. Designing a reusable data cleansing process that meets these requirements has therefore become a new challenge for the designers and developers of data cleansing software.

To address this problem, this thesis studies in depth the logical model of data cleansing and its component-based physical implementation. The main work consists of two parts.

In the first part, DCPM (Data Cleansing Process Model) is proposed, and the feasibility of modeling data cleansing processes with workflow modeling technology is demonstrated, so that a cleansing process can be modeled with a mature modeling technique and conform to an agreed model. The model describes the attributes of, and the relations among, the internal elements of a data cleansing process flow. Modeling cleansing processes on a uniform model increases their reusability.

In the second part, based on an analysis of the new requirements in the design and development of data cleansing software and of the deficiencies in current development methods, C+ADC (Component-extended Agile Data Cleaning) is proposed and implemented, comprising a run-time platform and a set of framework service components. Because the framework makes it easy and flexible to build data cleansing applications from extended components, it substantially reduces development cost and cycle time. In addition, the component model integrated with the C+ADC framework and the mapping strategy from the business space of the data cleansing process to components are defined, so that data cleansing applications can be implemented more effectively.
A practical development case shows that building a data cleansing application on DCPM (Data Cleansing Process Model) and C+ADC (Component-extended Agile Data Cleaning) makes it possible to quickly construct flexible, extensible, component-based data cleansing software.
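
The general idea of mapping the steps of a cleansing process onto reusable components can be illustrated with a minimal sketch. This is not the thesis's DCPM or C+ADC implementation, whose details are not given in the abstract; every name below (`REGISTRY`, `register_component`, `Step`, `CleansingProcess`, `run`, and the `trim` and `dedupe` components) is hypothetical and chosen only for illustration. The sketch assumes a process is an ordered list of steps and that each step names a registered component.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]
Component = Callable[[List[Record], dict], List[Record]]

# Hypothetical component registry: maps a component name to a callable.
REGISTRY: Dict[str, Component] = {}


def register_component(name: str):
    """Decorator that adds a cleansing component to the registry."""
    def register(fn: Component) -> Component:
        REGISTRY[name] = fn
        return fn
    return register


@register_component("trim")
def trim(records: List[Record], params: dict) -> List[Record]:
    # Strip surrounding whitespace from one field of every record.
    name = params["field"]
    return [{**r, name: str(r[name]).strip()} for r in records]


@register_component("dedupe")
def dedupe(records: List[Record], params: dict) -> List[Record]:
    # Keep the first record seen for each value of the key field.
    key = params["key"]
    seen = set()
    unique = []
    for r in records:
        if r[key] not in seen:
            seen.add(r[key])
            unique.append(r)
    return unique


@dataclass
class Step:
    """One step of the process model: a component name plus its parameters."""
    component: str
    params: dict = field(default_factory=dict)


@dataclass
class CleansingProcess:
    """An ordered list of steps (an assumed, simplified process model)."""
    steps: List[Step]


def run(process: CleansingProcess, records: List[Record]) -> List[Record]:
    # Resolve each step to its registered component and apply it in order.
    for step in process.steps:
        records = REGISTRY[step.component](records, step.params)
    return records
```

In this sketch the process description is plain data, so a process can be created or amended at run time without changing the components, and a new component becomes available simply by registering it.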
Keywords: data cleansing, componentization, data warehouse, process model, reusability