
Research On Knowledge Acquisition For Decision Information System Based On Rough Set Theory

Posted on: 2007-06-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B B Qu
GTID: 1118360242961877
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of database and data warehousing technology, large amounts of data are being stored. The demand for extracting hidden, useful knowledge from these data sets has outpaced traditional methods of data analysis. Two questions follow: how to rapidly acquire valuable knowledge from real-world databases, which tend to be very large, redundant and noisy, and how to forecast future behavior. These questions led to the emerging field of knowledge discovery in databases.

Unlike fuzzy set theory and Dempster-Shafer evidence theory, rough set theory is a comparatively new mathematical tool for dealing with vagueness and uncertainty. Its most notable characteristic is that it can discover and analyze hidden knowledge in a data set without needing any prior or additional information about the data. Rough set theory therefore has many advantages as a tool for knowledge discovery. In rough set theory, knowledge is expressed in an information table or a decision table.

Attribute reduction is the main process of knowledge acquisition based on rough sets. By analyzing the attribute reduction algorithms for consistent decision tables, the reason for their inefficiency was identified. A new algorithm is then proposed that adopts a hierarchy structure and uses the boundary of an attribute as the heuristic function for choosing essential attributes. The algorithm selects the important attributes that reflect the characteristics of the system while the universe shrinks. Experimental results show that, provided classification precision is unchanged, the algorithm can find an optimal or near-optimal attribute reduct.

Obtaining a decision rule set is the purpose of reduction. Ordinary attribute value reduction methods mainly deal with consistent samples but not inconsistent ones. Using the two concepts of confidence and coverage, an adaptive default rule generation algorithm is proposed. The local minimal confidence and coverage of the complete decision table are calculated and used as thresholds to control the number of rules generated.
Finally, certain rules that satisfy the minimal coverage threshold and possible rules that satisfy the minimal confidence threshold are extracted. Experimental results indicate that the algorithm effectively eliminates the redundant rules caused by noise and yields a more compact rule set.

The results acquired from the decision table are applied to inference. The quality of inference depends on how many of the inferred conclusions are correct. From the viewpoint of information theory, a reasoning method is developed whose rule-selection strategy prefers rules with high rule information. Compared with other strategies, the rule set induced by the adaptive default rule generation algorithm performs better under this high-rule-information-first strategy.

In practice, because of measurement errors, limited understanding of the data, or limits on data acquisition, incomplete information systems with missing values often arise in knowledge acquisition. Applying rough set theory to incomplete information systems is one of the key problems in putting the theory into practice. However, classical rough set theory has difficulty handling such incomplete information systems. After analyzing several extensions of classical rough set theory, a new extended rough set model based on a limited non-symmetric similarity relation is developed. It inherits the merits of the other extensions of the classical rough set model while avoiding their limitations. It also agrees with practice and is more suitable for processing incomplete information systems.

Traditional rule generation algorithms are based on complete data sets. For incomplete information systems, the classical discernibility matrix is extended on the basis of the limited non-symmetric similarity relation model.
Using Boolean reasoning, precise and simple rules can be extracted directly from the incomplete decision table without changing the size of the original incomplete system. At the same time, rules with high confidence are not affected by the missing values.
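
The confidence and coverage measures that the abstract relies on can be illustrated with a minimal Python sketch. It uses the standard rough set definitions (confidence is the share of objects matching a rule's condition that also carry its decision; coverage is the share of the decision class that the rule matches). The toy table, the attribute names, the function names `rule_measures` and `filter_rules`, and the fixed thresholds are all hypothetical; the dissertation derives its thresholds adaptively, and that procedure is not reproduced here.

```python
# Hypothetical toy decision table: each row is (condition attributes, decision).
# The data and attribute names are invented for illustration only.
TABLE = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "stay"),  # conflicts with the two rows above
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
]


def rule_measures(table, condition, decision):
    """Return (confidence, coverage) of the rule `condition -> decision`.

    confidence = |match(condition) & class(decision)| / |match(condition)|
    coverage   = |match(condition) & class(decision)| / |class(decision)|
    """
    # Decisions of all objects whose attribute values satisfy the condition.
    matching = [d for attrs, d in table
                if all(attrs.get(a) == v for a, v in condition.items())]
    in_class = sum(1 for _, d in table if d == decision)
    both = sum(1 for d in matching if d == decision)
    confidence = both / len(matching) if matching else 0.0
    coverage = both / in_class if in_class else 0.0
    return confidence, coverage


def filter_rules(table, rules, min_confidence, min_coverage):
    """Split candidate rules into certain and possible rules.

    Assumption for this sketch: a certain rule has confidence 1 and is kept
    if it meets the coverage threshold; a possible rule has confidence below 1
    and is kept if it meets the confidence threshold.
    """
    certain, possible = [], []
    for condition, decision in rules:
        conf, cov = rule_measures(table, condition, decision)
        if conf == 1.0:
            if cov >= min_coverage:
                certain.append((condition, decision))
        elif conf >= min_confidence:
            possible.append((condition, decision))
    return certain, possible


if __name__ == "__main__":
    candidates = [
        ({"outlook": "sunny", "windy": "no"}, "play"),
        ({"outlook": "sunny", "windy": "no"}, "stay"),
        ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ]
    certain, possible = filter_rules(TABLE, candidates, 0.6, 0.5)
    print("certain:", certain)
    print("possible:", possible)
```

In this toy table the inconsistent "sunny, not windy" objects give rise only to possible rules, and the low-confidence rule for "stay" is filtered out as likely noise, which is the kind of pruning the abstract attributes to the confidence threshold.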
Keywords/Search Tags: Rough set, Hierarchy reduction, Adaptive default rule, Limited non-symmetric similarity relation model, Extended discernibility matrix