Research Of The Multivariable Decision Trees Based On Rough Set And Its Application

Posted on: 2006-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Jia
Full Text: PDF
GTID: 2168360152990218
Subject: Computer application technology
Abstract/Summary:
Classification is an important problem in data mining, and decision trees are one of the most important classification techniques. Why decision trees? First, a decision tree is easy for humans to comprehend, and decision tree induction is efficient and therefore suitable for large training sets. Second, decision tree generation algorithms require no information beyond what is already contained in the training data. Finally, decision trees achieve good accuracy compared with other techniques. Decision trees also have limitations, however. On the one hand, they cannot remove irrelevant attributes; on the other hand, most decision trees can test only one attribute at each node. To overcome these limitations, we introduce Rough Set (RS) techniques. Rough set theory is a new mathematical tool for dealing with fuzzy and uncertain knowledge, and it has a strong ability to acquire knowledge. Its central idea is that knowledge is tied to classification: knowledge is the ability to classify objects. Although rough set theory is effective in dealing with imperfect knowledge, it is weak in fault tolerance and generalization, so it needs to be combined with other techniques. Considering the advantages and disadvantages of decision trees and RS, we combine the two in this research so that each offsets the other's weaknesses.
For data with only discrete attributes, we propose a new condition attribute reduction algorithm that takes into account the core of the condition attributes with respect to the decision attributes in rough set theory, the classification ability of the condition attributes, and the scale of the decision tree to be built. We prove that the accuracy of decision trees constructed by the improved algorithm is equal to that of the ID3 algorithm, and that the new algorithm is more efficient than ID3. Finally, following the new algorithm framework, we designed a KDD system that performs pre-processing based on RS and performs classification with the improved decision tree algorithm and its prediction model. We successfully applied the KDD system to classifying data derived from the first-day (front page) medical records of inpatients with cardiovascular diseases.
Keywords/Search Tags: Data Mining, Rough Set, Decision Trees, ID3