
Research Of The Generalized Rough Set Model And Its Application In Data Mining

Posted on: 2003-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Sai
Full Text: PDF
GTID: 1118360092998852
Subject: Management Science and Engineering
Abstract/Summary:
Data mining and knowledge discovery in databases is a new technology for extracting knowledge from data. As an effective approach to processing incomplete, imprecise, or uncertain information, rough set theory has played an important role in data mining and knowledge discovery. Successful applications of the rough set approach depend largely on the completeness of its theory: only after the theory has been studied systematically and in depth can rough sets be applied to practical domains. Against the background of a project on new technology for data warehousing and data mining in management decision support, funded by the National Natural Science Foundation, this thesis systematically addresses the main research topics and methods of rough set theory, in both theory and application. The primary contributions of this thesis are as follows.

Having studied rough set theory carefully, the thesis identifies the relationship between rough sets and abstract theories such as modal logic, fuzzy sets, algebraic systems, and interval-set algebra; that is, rough sets provide a well-defined semantic interpretation of these abstract theories, which enables us to understand them better. In addition, rough sets build an inter-relationship among these dependent theories and connect them together.

The thesis studies generalized rough set models and proposes CBM-RS, a multi-level rough set approximation model based on a covering of the universe. We verify that the multi-level rough set approximation induced by a sequence of reflexive relations is a special case of CBM-RS. The CBM-RS model thus overcomes the limitation of multi-level rough set models that are induced only by binary relations. The thesis also examines the rough set model based on classification accuracy, on which the MIE-RS data mining approach presented later is built.

The thesis proposes MIE-RS, a rough set approach to mining minimal rules in inconsistent decision tables. We handle inconsistency through classification accuracy: using heuristic algorithms, we obtain a set of minimal production rules that satisfy a given classification accuracy. To implement the algorithm, we construct two hash functions that reduce the time complexity. Several UCI data sets are used to test the approach. Compared with the Rosetta toolkit, our method greatly increases the data reduction rate and effectively simplifies the resulting rules.

The thesis proposes OITM, a data analysis and data mining model for ordered information tables. Ordering of objects is a fundamental issue in human decision making and may play a significant role in the design of intelligent information systems. This problem is considered from the perspective of data mining. The commonly used attribute-value approaches are extended by introducing order relations on attribute values, which generalizes the notion of information tables to ordered information tables. A data analysis method is then proposed to describe the properties of ordered information tables. We define concepts such as reduct and core by analyzing attribute dependency in ordered information tables. The thesis also proposes and formalizes the problem of mining ordering rules, designs the ordered decision logic language (ODL-language), and gives a solution for mining ordering rules. Mining ordering rules based on order relations is a concrete application of generalized rough set models with non-equivalence relations.

The results of this thesis are of theoretical and practical significance for extending rough set theory and its application in data mining.
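
As background for the approximation models mentioned above, the following is a minimal sketch of the classical (Pawlak) lower and upper approximations on a toy inconsistent decision table, together with the standard accuracy of approximation. It is the textbook construction only, not the CBM-RS or MIE-RS algorithms of the thesis, and the classification accuracy measure used in the thesis may be defined differently. The table contents and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy decision table: object id -> (condition values, decision).
# Objects 2 and 3 have identical condition values but different decisions,
# so the table is inconsistent.
TABLE = {
    1: (("high", "yes"), "flu"),
    2: (("high", "no"), "flu"),
    3: (("high", "no"), "healthy"),
    4: (("normal", "no"), "healthy"),
}


def equivalence_classes(table):
    """Group objects that are indiscernible on the condition attributes."""
    blocks = defaultdict(set)
    for obj, (conditions, _decision) in table.items():
        blocks[conditions].add(obj)
    return list(blocks.values())


def approximations(table, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


def accuracy(table, target):
    """Accuracy of approximation: |lower| / |upper| (1.0 for an empty upper)."""
    lower, upper = approximations(table, target)
    return len(lower) / len(upper) if upper else 1.0


flu = {obj for obj, (_conditions, decision) in TABLE.items() if decision == "flu"}
lower, upper = approximations(TABLE, flu)
```

In this toy table the "flu" objects are {1, 2}; their lower approximation is {1} and their upper approximation is {1, 2, 3}, so the accuracy of approximation is 1/3. The gap between the two approximations comes entirely from the inconsistent pair of objects 2 and 3.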
Keywords/Search Tags: rough set, data mining, generalized rough set approximation model, ordering rule, ranking