
Rough Sets In Knowledge Discovery

Posted on: 2007-08-24 | Degree: Master | Type: Thesis
Country: China | Candidate: F Q Liu | Full Text: PDF
GTID: 2208360182481298 | Subject: Industrial Economics
Abstract/Summary:
With the development of database technology and the arrival of the information era, large amounts of data are accumulating in many industries, and databases are growing rapidly. To improve the efficiency of work and the quality of life, people need to derive the valuable knowledge embedded in these databases, which has motivated research on knowledge discovery in databases (KDD). However, databases usually contain redundant, missing, uncertain, and inconsistent data, which are a major barrier to extracting knowledge from them.

Rough Sets (RS) theory was proposed by Zdzisław Pawlak in 1982. After more than 20 years of development, it has achieved fruitful results in both theory and applications. RS does not depend on additional information beyond the data set itself, which makes it a potent tool for dealing with vague, imprecise, incomplete, and uncertain data. Some traditional knowledge discovery methods are suitable only for precise sets, not for rough sets. Since many real-life data sets are rough, knowledge discovery models based on rough sets theory play an important role in information systems.

First, the history, current status, and possible development directions of KDD are introduced, and the main methods and techniques of KDD are reviewed. Second, rough sets theory is introduced and the general procedure for applying it in KDD is analyzed. Then the author's research on data discretization is presented in detail. A general mathematical formulation of the discretization of continuous data is given, and a global discretization method is proposed by modifying the local method based on the MDLPC criterion with the help of rough sets theory. The MDLPC method is made global by introducing inconsistency checking based on rough sets theory, which preserves the fidelity of the original data. The cut points are then reduced, which does not change the consistency level and leads to a smaller learning model. Finally, a KDD model based on rough sets theory is proposed and applied to diagnosis on the Cleveland Heart Disease data. The experimental results show that the model is advanced and practical.
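
The inconsistency check and cut-point reduction described above can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the greedy removal order, and the definition of the consistency level used here (the share of objects whose indiscernibility class carries a single decision value, i.e. the relative size of the rough-set positive region) are assumptions made for illustration. The initial cuts are taken as given; in the thesis they would come from the MDLPC-based step.

```python
from bisect import bisect_right
from collections import defaultdict


def discretize(row, cuts):
    """Replace each continuous value by the index of its interval."""
    return tuple(bisect_right(sorted(cuts[a]), v) for a, v in enumerate(row))


def consistency_level(rows, labels, cuts):
    """Share of objects lying in classes with exactly one decision value.

    Objects with the same discretized description are indiscernible;
    a class is consistent if all its objects share one label.
    """
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        classes[discretize(row, cuts)].append(label)
    consistent = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return consistent / len(rows)


def reduce_cuts(rows, labels, cuts):
    """Greedily drop cuts whose removal does not lower the consistency level."""
    target = consistency_level(rows, labels, cuts)
    kept = [list(c) for c in cuts]
    for a in range(len(kept)):
        for cut in list(kept[a]):
            # Hypothetical candidate: the current cuts without this one.
            trial = [[x for x in c if x != cut] if i == a else c
                     for i, c in enumerate(kept)]
            if consistency_level(rows, labels, trial) >= target:
                kept = trial
    return kept


# Toy data (invented for this example, not from the Cleveland data).
rows = [(1.0, 5.0), (2.0, 5.5), (3.0, 9.0), (4.0, 9.5)]
labels = ["no", "no", "yes", "yes"]
cuts = [[1.5, 2.5, 3.5], [7.0]]
reduced = reduce_cuts(rows, labels, cuts)
```

On this toy data every cut on the first attribute is redundant once the single cut on the second attribute is kept, so the reduced set is smaller while the data stay fully consistent. A greedy pass like this depends on the order in which cuts are tried and does not guarantee a minimal set of cuts.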
Keywords/Search Tags: Knowledge Discovery in Databases, Rough Sets, Discretization