Research On Framework Of Ontology Based Data Cleaning System

Posted on: 2009-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: L C Zhang
GTID: 2178360272477166
Subject: Computer application technology
Abstract/Summary:
With the rapid development of database technology and the diversification of data acquisition channels, both the variety and the volume of data are growing rapidly. The value of data lies in its quality rather than its quantity, and decisions based on bad data cannot be trusted. Large volumes of disordered, poor-quality data have become a "bottleneck" in data applications. Data cleaning, as a primary means of resolving data quality problems, has become an active research topic. However, most current research operates on the textual values of the data rather than on their latent semantics, and how to introduce semantics into data cleaning is an emerging research focus. This dissertation studies data cleaning and its semantics. The main contributions are as follows:

Firstly, data quality and data cleaning are studied against the background of information system construction. Based on an analysis of domestic and international research in this field, the weaknesses of current approaches are summarized. Ontology and its key technologies are then introduced to address these weaknesses, and the rationale for this approach is argued.

Secondly, research on knowledge and its representation methods, and on ontology and its key technologies, is summarized and serves as the theoretical basis of this work.

Thirdly, an ontology-based data cleaning system framework is proposed. According to the characteristics of resource description, the framework is divided into an ontological expression model and a dynamic processing model, which describe static semantic information and processing semantic information, respectively. A formal description of each component of the models is given, together with the working principles and implementation mechanisms of the main modules.

Finally, the data cleaning system framework is designed and implemented based on the analysis of both semantic models. The static structural design and the dynamic behavioral semantics are modeled with UML. The framework addresses the lack of semantic constraints and automated reasoning in current research.
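The split between static semantic information and processing semantic information can be illustrated with a minimal sketch. It is not the framework described in the thesis: the names `Concept`, `Ontology`, and `clean_column`, the synonym-based lookup, and the sample city data are all assumptions made for illustration. The sketch only shows how a cleaning rule might resolve values by meaning rather than by exact text match.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Static semantic information: a canonical term and its known synonyms."""
    name: str
    synonyms: set = field(default_factory=set)


class Ontology:
    """A toy stand-in for an ontological expression model (hypothetical)."""

    def __init__(self, concepts):
        # Map every normalized term (name or synonym) to its canonical name.
        self._index = {}
        for concept in concepts:
            for term in {concept.name, *concept.synonyms}:
                self._index[term.strip().lower()] = concept.name

    def canonical(self, value):
        """Return the canonical concept name for a raw value, or None."""
        return self._index.get(value.strip().lower())


def clean_column(values, ontology):
    """A toy stand-in for a dynamic processing step (hypothetical).

    Values that map to a known concept are replaced by its canonical name;
    unknown values are kept unchanged and collected for manual review.
    """
    cleaned, unresolved = [], []
    for value in values:
        canonical = ontology.canonical(value)
        if canonical is None:
            cleaned.append(value)
            unresolved.append(value)
        else:
            cleaned.append(canonical)
    return cleaned, unresolved


# Illustrative data only; not taken from the thesis.
ontology = Ontology([
    Concept("Beijing", {"Peking", "BJ"}),
    Concept("Shanghai", {"SH"}),
])
cleaned, unresolved = clean_column(
    ["peking", " BJ ", "Shanghai", "Nanjng"], ontology
)
```

In this sketch, "peking" and " BJ " are unified under one concept even though their text differs, while the misspelled "Nanjng" is left untouched and flagged, since a synonym table alone cannot resolve it. A real ontology would add relations and constraints that support automated reasoning over such cases.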
Keywords/Search Tags: Data Quality, Data Cleaning, Ontology, Cleaning Rule, Task Structure, Framework