Font Size: a A A

Research On Decision Tree Improvement Based On Rough Set Theory

Posted on:2010-09-13Degree:MasterType:Thesis
Country:ChinaCandidate:J L ZhengFull Text:PDF
GTID:2178360275477638Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Classification is one of importance tasks in data mining, and decision tree is one of the models that are often used in classification. It has been widely researched and applied since it was proposed in 1966. But the decision trees are always tends to be over-fitting, to have larger scales and to induce longer classification rules in that the tree induction algorithm adopts greedy method. Many methods are proposed to improve these flaws mentioned above. In this dissertation these methods are studied completely and sequentially a new method to optimize decision tree based on rough set is put forward.There are four main points in this dissertation as follows:(1) Introduces KDD involved definition, basic process, application field and the main issues. Subsequently, introduces the applications of decision tree, the classification models which used usually and the classical classification models based on decision tree.(2) A detailed survey of all the decision tree optimization approaches is given, such as modifying test space, modifying test search, tree pruning, restricting database and alternating data structures. Each kind of classical algorithms is also analysed and the advantages and the disadvantages of each kind are also given.(3) The advanced multivariate decision tree algorithm named VPMDT is proposed based on the deficiencies of the multivariate decision tree algorithms which have proposed before. The tree scale of model constructed by VPMDT algorithm is less than other algorithm's in that it selects the reasonable attribute combination as the splited attribute. Through experiment, it demonstrates that our algorithm can perform well, and the predictive accuracy is advisable.(4) Based on the research above, an experimental system is carried out. The system reads data from text file and then constructs the tree model. It can compute the predictive accuracy of the model. We can use it to compare the performance of the decision algorithms which embeded in our system.
Keywords/Search Tags:data mining, classification, decision tree, rough set
PDF Full Text Request
Related items