Application Of Artificial Intelligence On Data Cleaning

Posted on: 2007-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2178360185997210
Subject: Computer Science and Technology
Abstract/Summary:
As business grows rapidly, huge amounts of data are constantly generated in areas such as manufacturing management, technology quality, and financial cost. How to make good use of these data and systems, improve data quality, supply correct data to the Decision Support System, and extract information and knowledge from data is a question worth discussing. Data cleansing, also called data scrubbing, deals with detecting and removing errors and inconsistencies from data in order to improve data quality.

Based on the data quality issues and the customer's special requirements in the CATT project, this paper builds a metadata model for the data cleaning method and proposes a new idea: introducing artificial intelligence methods into data cleaning.

In this project the metadata is divided into two parts, logic and information. Logic represents rules: the detailed way of processing data, as designed and implemented by the designer. Information can be updated automatically by the program. The cleaning operations on data, and the rules used to validate the cleaned data, all belong to logic. Dirty data, cleaned data, validation results, and the data dictionary used by the cleaning procedure are all information.

The artificial intelligence module in this paper uses a Bayes text identification method. It applies the Naïve Bayes classification algorithm to determine whether a given field matches the characteristics of correct data. Integrating the artificial intelligence module into the whole data cleaning procedure, and notifying...
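The Naïve Bayes field check described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's features, training data, or implementation, so everything here is an assumption made for illustration: the character-bigram features, the class labels `clean` and `dirty`, the class name `NaiveBayesFieldChecker`, and the sample date values are all hypothetical.

```python
import math
from collections import Counter


def char_bigrams(value):
    """Split a field value into overlapping character bigrams.

    Bigrams are an assumed feature choice, not taken from the thesis.
    """
    padded = f"^{value}$"  # mark the start and end of the value
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


class NaiveBayesFieldChecker:
    """Multinomial Naive Bayes with Laplace smoothing over character bigrams."""

    def fit(self, values, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.classes}
        for value, label in zip(values, labels):
            self.feature_counts[label].update(char_bigrams(value))
        self.vocabulary = set()
        for counts in self.feature_counts.values():
            self.vocabulary.update(counts)
        return self

    def log_scores(self, value):
        """Return log P(class) + sum of log P(bigram | class) for each class."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / total)
            denom = sum(self.feature_counts[c].values()) + vocab_size
            for gram in char_bigrams(value):
                # Add-one smoothing keeps unseen bigrams from zeroing the product.
                score += math.log((self.feature_counts[c][gram] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, value):
        scores = self.log_scores(value)
        return max(scores, key=scores.get)


# Invented training data for a hypothetical date field.
clean_values = ["2006-01-15", "2005-12-03", "2006-07-21", "2004-03-30"]
dirty_values = ["n/a", "unknown", "??", "15 Jan", "tbd"]

checker = NaiveBayesFieldChecker().fit(
    clean_values + dirty_values,
    ["clean"] * len(clean_values) + ["dirty"] * len(dirty_values),
)

date_label = checker.predict("2006-11-09")
typo_label = checker.predict("unknwn")
```

In a cleaning pipeline of the kind the abstract outlines, a value classified as `dirty` would be flagged for a validation rule or for review rather than corrected automatically; that routing is also an assumption here.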
Keywords/Search Tags: Data Cleansing, Data Quality, Machine Learning, Bayes, Meta Data, ETL, Data Warehouse