
Study Of Data Mining In Incomplete Information Systems Based On Rough Set Theory

Posted on: 2008-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
GTID: 2178360242956890
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the extensive application of massive databases and the fast development of the Internet, the amount of data stored in databases worldwide is increasing rapidly. Mining potential and valuable information from vast and varied data (namely, Data Mining) has therefore become one of the major research subjects in knowledge discovery. Rough set theory, introduced by Pawlak in 1982, is a mathematical tool for dealing with vagueness and uncertainty. Knowledge reduction is one of the central topics in rough set research: reduction decreases the dimension of structured data and yields data sets of varying compactness, which makes it one of the important tasks in Data Mining. Because indefinite or even missing data are common, the information systems presented to users are mostly incomplete, whereas classical rough set theory is based on complete information systems. It is therefore important to investigate how to obtain knowledge from incomplete information systems. This thesis studies knowledge discovery based on rough set theory in incomplete information systems. The main contents are as follows:

Firstly, the missing-data problem in the data mining process is analyzed, and the advantages and disadvantages of ways of dealing with incomplete information systems are compared.

Secondly, distribution reduction, maximum distribution reduction and assignment reduction are introduced into value-set information systems. On this basis, general and heuristic algorithms for assignment reduction and a general algorithm for maximum distribution reduction are presented, and their time complexity is analyzed. By converting an incomplete information system into a value-set system, attribute reduction of the incomplete information system is effectively resolved.

Thirdly, the reduction algorithms are validated by experiments under the tolerance relation and the partial-order (half-order) relation. The experiments show that these algorithms find the corresponding reduction results.

Fourthly, a model of a data mining system for incomplete information systems is designed.
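
The conversion of an incomplete information system into a value-set system, and the tolerance relation used on it, can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. It assumes the convention common in the rough set literature: a missing value may stand for any value in the attribute's domain, and two objects are tolerant when their value sets intersect on every chosen attribute. The table, the attribute names, the `MISSING` marker and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set

MISSING = None  # hypothetical marker for an unknown attribute value

Row = Dict[str, Optional[str]]
SetRow = Dict[str, FrozenSet[str]]


def to_value_set(table: List[Row], domains: Dict[str, Set[str]]) -> List[SetRow]:
    """Known value -> singleton set; missing value -> whole attribute domain."""
    return [
        {
            a: frozenset(domains[a]) if row[a] is MISSING else frozenset([row[a]])
            for a in domains
        }
        for row in table
    ]


def tolerant(x: SetRow, y: SetRow, attrs: List[str]) -> bool:
    """Objects are tolerant if their value sets overlap on every attribute."""
    return all(x[a] & y[a] for a in attrs)


def tolerance_classes(sv_table: List[SetRow], attrs: List[str]) -> List[Set[int]]:
    """For each object, the indices of all objects tolerant with it."""
    return [
        {j for j, y in enumerate(sv_table) if tolerant(x, y, attrs)}
        for x in sv_table
    ]


# Hypothetical incomplete table with two attributes.
table = [
    {"price": "high", "size": "full"},
    {"price": "low", "size": MISSING},
    {"price": MISSING, "size": "compact"},
    {"price": "low", "size": "full"},
]
domains = {"price": {"high", "low"}, "size": {"full", "compact"}}

sv_table = to_value_set(table, domains)
classes_all = tolerance_classes(sv_table, ["price", "size"])
classes_price = tolerance_classes(sv_table, ["price"])
```

Comparing `classes_all` with `classes_price` shows how dropping an attribute changes the tolerance classes. A reduction algorithm looks for a smaller attribute set that still preserves the property of interest, such as the distribution or assignment functions named in the abstract.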
Keywords/Search Tags: data mining, rough set, incomplete information system, value-set information system, attribute reduction