A Study Of Optimizing Data Mining Algorithms Based On Decision Tree

Posted on: 2006-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: W Wang
Full Text: PDF
GTID: 2168360155455020
Subject: Computer application technology
Abstract/Summary:
Data mining applies analytic tools to data that are massive, incomplete, noisy, fuzzy, and random. With it, we can uncover latent, useful information and knowledge that was previously hidden and unknown, and we can build relational models of the data to forecast future behavior. The decision tree model is the most frequently adopted model in data mining because it displays the characteristics of the data directly and is easy to understand. Moreover, a decision tree supports both classification and prediction, and decision rules can be read from it conveniently.

The process of building a decision tree is also a process of knowledge discovery. The quality of a decision tree is determined by its complexity and its prediction accuracy. Tree construction is guided by a heuristic rule. The ID3 and C4.5 algorithms, based on information theory, and the CART, SLIQ, and PUBLIC methods, based on the lowest Gini index, are very common in decision tree construction. Building an optimal decision tree has been proved to be an NP-hard problem. At present, new techniques such as genetic algorithms and correlation analysis have been introduced as heuristic rules. In addition, the computation of existing heuristic rules has been simplified, and the completeness of preliminary decision rules has been discussed.

To extract decision rules from a large number of attributes, this thesis introduces rough set theory to reduce the test attributes, identifying the attributes that actually influence the decision and thereby reducing the size of the decision tree. The thesis uses the similar degree of attribute between each test attribute and the decision attribute as the heuristic rule for generating the decision tree. The new algorithm is applied to build a decision tree in a comprehensive evaluation system for university teachers. The experimental results indicate that the new algorithm achieves better prediction accuracy than ID3 and requires simpler computation. Finally,...
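
For readers unfamiliar with the two families of heuristic rules named in the abstract, the following is a minimal Python sketch of the textbook information gain criterion (as used by ID3) and the weighted Gini index (as used by CART). It is not the thesis's attribute-similarity algorithm, which the abstract does not specify. The function names and the toy teacher-evaluation records are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def partition(rows, labels, attribute):
    """Group the class labels by the value each row takes on `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    return groups


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on `attribute` (ID3 criterion)."""
    groups = partition(rows, labels, attribute)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gini_split(rows, labels, attribute):
    """Weighted Gini impurity after splitting on `attribute` (lower is better)."""
    groups = partition(rows, labels, attribute)
    return sum(len(g) / len(labels) * gini(g) for g in groups.values())


# Hypothetical toy records, not data from the thesis.
rows = [
    {"title": "professor", "papers": "many"},
    {"title": "professor", "papers": "few"},
    {"title": "lecturer", "papers": "many"},
    {"title": "lecturer", "papers": "few"},
]
labels = ["good", "good", "fair", "fair"]

attributes = list(rows[0])
best_by_gain = max(attributes, key=lambda a: information_gain(rows, labels, a))
best_by_gini = min(attributes, key=lambda a: gini_split(rows, labels, a))
```

In this toy data both criteria select the same attribute for the root split, because `title` separates the two classes perfectly while `papers` carries no information about the class. On real data the two criteria can disagree, which is one reason alternative heuristic rules are studied.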
Keywords/Search Tags: Data mining, Decision tree, ID3 algorithm, Rough set, Similar degree of attribute