Knowledge Reduction Algorithms And Application Based On Rough Sets

Posted on: 2009-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yan
Full Text: PDF
GTID: 2178360272957350
Subject: Detection Technology and Automation
Abstract/Summary:
Rough Set (RS) theory, introduced by Z. Pawlak, is a mathematical tool for analyzing vague, uncertain, and fuzzy knowledge, and it can effectively handle imprecise, incomplete, or uncertain data. It has attracted the attention of researchers around the world. In recent years it has been successfully applied to data mining, machine learning, knowledge discovery in databases, decision support systems, fault diagnosis, and other areas. This thesis focuses on one of the important problems of Rough Set theory: the reduction of the decision table. Attribute reduction preserves the original meaning of the data while removing irrelevant and unimportant knowledge. The work covers the following three topics.

For complete and discrete information systems, attribute reduction is considered from the viewpoint of information theory. An improved attribute importance measure is defined based on the mutual information between the selected attributes and the decision attribute, and this measure serves as the heuristic information in the proposed algorithm. Conditional information entropy is used to compute the relevance of attributes, and it enters the fitness function of a genetic algorithm so that the resulting reduct contains few attributes while retaining the relevance between attributes.

Traditional Rough Set theory is generally incapable of handling incomplete information systems. Existing extensions of the Rough Set model are studied and their shortcomings are pointed out. Because attributes differ in how essential they are, an improved attribute importance measure is defined based on the difference degree of attributes, and an attribute reduction algorithm based on the connection degree of attribute essentiality is proposed. An example shows that the proposed algorithm is effective.

Another defect of Rough Set theory that hinders its development and application is that it cannot be applied directly to continuous values. Previously, a discretization method was applied beforehand to transform the data into discrete values, but this may result in information loss. The notions of similarity between objects and an improved general importance degree of an attribute are introduced, and a global similarity measure between objects is defined from them. Using the tolerance relation induced by this global similarity relation, reduction is applied directly to continuous attributes, which avoids the information loss incurred in the discretization process. Finally, the method is applied to fault data, and the result shows that it is effective.
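The abstract does not give the exact form of the thesis's improved importance measure, so the following is only a minimal sketch of the general idea behind mutual-information-based heuristic reduction on a complete, discrete decision table. It assumes the common textbook measure, the drop in conditional entropy H(D | B) when an attribute is added to the selected set B, and a plain greedy search rather than the genetic algorithm mentioned above. The function names, the toy table, and the attribute names `a`, `b`, `c`, `d` are all hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(rows, cond, dec):
    """Return H(dec | cond) in bits for a decision table given as a list of dicts."""
    n = len(rows)
    # Counts of (condition-value tuple, decision value) and of the condition tuple alone.
    joint = Counter((tuple(r[a] for a in cond), r[dec]) for r in rows)
    marginal = Counter(tuple(r[a] for a in cond) for r in rows)
    return -sum(c / n * log2(c / marginal[key]) for (key, _), c in joint.items())


def significance(rows, selected, attr, dec):
    """Gain in mutual information with the decision when attr joins the selected set."""
    return (conditional_entropy(rows, selected, dec)
            - conditional_entropy(rows, selected + [attr], dec))


def greedy_reduct(rows, attrs, dec, eps=1e-12):
    """Add the most significant attribute until H(dec | selected) matches H(dec | attrs)."""
    target = conditional_entropy(rows, attrs, dec)
    selected = []
    while conditional_entropy(rows, selected, dec) - target > eps:
        candidates = [a for a in attrs if a not in selected]
        best = max(candidates, key=lambda a: significance(rows, selected, a, dec))
        selected.append(best)
    return selected


# Hypothetical toy table: the decision d equals (a OR b); c carries no information about d.
table = [
    {"a": a, "b": b, "c": c, "d": int(a or b)}
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
]

reduct = greedy_reduct(table, ["a", "b", "c"], "d")
print(sorted(reduct))
```

On this toy table the search keeps `a` and `b` and never selects `c`, because adding `c` does not lower the conditional entropy of the decision. A greedy forward search of this kind yields a set that preserves the information about the decision but may contain redundant attributes in general; a backward pruning pass, or a global search such as the genetic algorithm the abstract describes, is the usual way to address that.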
Keywords/Search Tags: rough set, information system, heuristic algorithm, attribute reduction, genetic algorithm, information theory, global similarity relation, fault diagnosis