Research On An Approach Of Incomplete Information Processing Based On The Rough Set Theory

Posted on: 2008-06-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Z M Zhang
GTID: 2178360212975629 | Subject: Computer system architecture
Abstract/Summary:
In Knowledge Discovery in Databases (KDD), one often faces incomplete information systems: in real-world applications, a substantial proportion of the data may be missing. Handling incomplete data is important because missing values can lead to confused and unreliable results in data mining. As a mathematical tool for dealing with inexact, uncertain, or vague knowledge, rough set theory has had great success in KDD in recent years. Its most prominent advantage is that it needs only the data provided in the information system and relies on no other model assumptions. At present, the theoretical framework for KDD in incomplete information systems based on rough set theory is basically complete, but the variety and quality of the extracted knowledge still need to be improved.
The main work of this paper is an in-depth study of methods for processing incomplete data using rough set theory, with the aim of improving the quality and efficiency of KDD. There are two approaches to KDD in incomplete information systems: one is to complete the incomplete information system first and then extract knowledge from the completed system; the other is to extract knowledge directly from the incomplete information system without changing the original system. Starting from these two approaches, this paper proposes two new algorithms under rough set theory to improve KDD performance.
First, this paper analyzes the limitations of existing data filling algorithms, extends the valued tolerance relation matrix in rough set theory, introduces the divide-and-conquer idea, and proposes a new algorithm, RSDIDA. The experimental results show that it greatly improves the filling ratio and efficiency. Second, this paper proposes an optimized knowledge reduction algorithm for incomplete information systems that leaves the original system unchanged. It first analyzes the limitations of traditional rough entropy and the corresponding knowledge reduction algorithms, extends the incomplete entropy so that it describes the uncertainty of knowledge more precisely, and then uses rough entropy together with the new incomplete entropy to define attribute significance. On this basis, the new knowledge reduction algorithm is given. An example analysis demonstrates its validity.
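To make the first approach (completing the system before extracting knowledge) more concrete, the following is a minimal sketch of filling based on the standard tolerance relation, under which two objects are tolerant if they agree on every attribute where both values are known. It is not RSDIDA, whose details the abstract does not give; the names `MISSING`, `tolerant`, `tolerance_class`, and `fill_missing`, the sample table, and the rule "fill only when all tolerant objects agree" are assumptions made for illustration.

```python
MISSING = None  # marker for an unknown attribute value (an assumption of this sketch)


def tolerant(x, y):
    """True if x and y agree on every attribute where both values are known."""
    return all(a is MISSING or b is MISSING or a == b for a, b in zip(x, y))


def tolerance_class(table, i):
    """Indices of all objects tolerant with object i (always includes i itself)."""
    return [j for j, row in enumerate(table) if tolerant(table[i], row)]


def fill_missing(table):
    """Single pass over the original table: fill a missing value only when
    every tolerant object with a known value for that attribute agrees."""
    filled = [list(row) for row in table]
    for i, row in enumerate(table):
        neighbours = tolerance_class(table, i)
        for k, value in enumerate(row):
            if value is not MISSING:
                continue
            known = {table[j][k] for j in neighbours if table[j][k] is not MISSING}
            if len(known) == 1:  # unambiguous candidate; otherwise leave the gap
                filled[i][k] = known.pop()
    return filled


# Hypothetical incomplete information system: one row per object.
table = [
    ("high", "yes", MISSING),
    ("high", "yes", "flu"),
    ("low", MISSING, "cold"),
    ("low", "no", "cold"),
]

completed = fill_missing(table)
```

In this sample, object 0 is tolerant only with itself and object 1, so its missing third attribute is filled with "flu"; object 2 is tolerant with object 3, so its missing second attribute is filled with "no". A value for which the tolerant objects disagree would stay missing, which is one simple reason a filling ratio can fall below 100%.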
Keywords/Search Tags: Knowledge discovery, Rough set, Incomplete information system, Data filling, Knowledge reduction