Rough Set Approach To Data Mining In Incomplete Information Systems

Posted on: 2006-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: M L Liang
Full Text: PDF
GTID: 2168360152494364
Subject: Computer application technology
Abstract/Summary:
Missing or incomplete data are a major concern in data mining, both because a substantial proportion of real-world data may be missing and because poor methods for handling incomplete data bias the results of data mining. Mining an incomplete information system is also considerably harder than mining a complete one, since it contains more uncertainty. This paper applies rough set theory, a mathematical tool for dealing with inexact, uncertain, or vague knowledge, to the handling of incomplete data in data mining, so as to narrow the large gap between the available data and the machinery available to process it.

The main issues related to the incomplete data problem are detailed first. The commonly used methods for handling incomplete data in data mining are then reviewed, with a discussion of their known strengths and weaknesses. Next, rough set theory is introduced. Several extensions of rough sets to incomplete information systems are studied and their performance is compared. On that basis, an algorithm for generating optimal decision rules is presented and proved, and a new extension of rough sets based on the limited tolerance relation is proposed, together with knowledge reduction methods for it. Finally, a model of a data mining system under incomplete information is given.
Keywords/Search Tags:data mining, rough set theory, incomplete information system, limited tolerance relation, missing data imputation
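
For readers unfamiliar with the relations named in the abstract, the following is a minimal Python sketch of how rough set approximations can be computed on an incomplete table. It is not the thesis's algorithm or its new extension. It uses the tolerance relation and the limited tolerance relation as they are commonly defined in the rough set literature, and the toy table, the `*` marker for a missing value, and all function names are assumptions made for illustration.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def tolerant(x, y):
    """Tolerance relation: on every attribute the values match
    or at least one of the two is missing."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(x, y))


def limited_tolerant(x, y):
    """Limited tolerance relation: either both objects are entirely
    unknown, or they share at least one known attribute and agree on
    every attribute that is known in both."""
    known_x = {i for i, a in enumerate(x) if a != MISSING}
    known_y = {i for i, b in enumerate(y) if b != MISSING}
    if not known_x and not known_y:
        return True
    shared = known_x & known_y
    return bool(shared) and all(x[i] == y[i] for i in shared)


def classes(table, related):
    """Map each object to the set of objects related to it."""
    return {u: {v for v in table if related(table[u], table[v])} for u in table}


def approximations(table, related, target):
    """Lower approximation: objects whose class lies inside target.
    Upper approximation: objects whose class intersects target."""
    cls = classes(table, related)
    lower = {u for u, c in cls.items() if c <= target}
    upper = {u for u, c in cls.items() if c & target}
    return lower, upper


# Hypothetical incomplete information system with three attributes.
table = {
    "x1": ("high", "*", "yes"),
    "x2": ("high", "low", "yes"),
    "x3": ("*", "*", "*"),
    "x4": ("low", "low", "no"),
}
target = {"x1", "x2"}  # the concept to approximate

tol_lower, tol_upper = approximations(table, tolerant, target)
lim_lower, lim_upper = approximations(table, limited_tolerant, target)
print("tolerance:        ", sorted(tol_lower), sorted(tol_upper))
print("limited tolerance:", sorted(lim_lower), sorted(lim_upper))
```

In this toy table the fully unknown object `x3` is tolerant with every other object, which empties the lower approximation under the plain tolerance relation. The limited tolerance relation requires at least one shared known attribute, so `x3` relates only to itself and the approximation of the target becomes exact.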