Improvement of the ID3 Decision Tree Algorithm

Posted on: 2010-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: H Li
GTID: 2208360275982870
Subject: Software engineering
Abstract/Summary:
Data mining is the process of applying analytic tools to data that are massive, noisy, fuzzy, incomplete, and random. With it, we can find latent, useful knowledge and information that was previously concealed and unknown, and we can build relational models of the data to forecast future behavior. Classification is one of the most important techniques in data mining and an important research topic in its own right, and the decision tree method is a focus of classification research. A decision tree directly reflects the characteristics of the data and is easy to understand. Moreover, the decision tree model classifies and predicts well, and decision rules can be derived from it conveniently.

Many scholars have proposed decision tree algorithms for classifying large-scale data sets, of which the ID3 algorithm, proposed by Quinlan in 1986, is the most typical. This algorithm has two major shortcomings. First, it is biased toward attributes that have many values, even though such attributes are not always the best. Second, it can handle only discrete attributes, not continuous ones.

To solve these problems, this thesis introduces an improved algorithm based on ID3 that uses the information gain ratio and splits the values of each continuous attribute into two ranges. Following an object-oriented approach, the thesis implements both ID3 and the improved algorithm in Java. Simulation experiments indicate that the decision tree built by the new algorithm is better than the one built by ID3. In addition, XML is used in the improvement and implementation of the ID3 algorithm. Because XML can express all kinds of data, exchange data between different formats, and provide a unified interface, it offers a way to mine data by transforming an arbitrary database into XML format.
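The abstract does not give the formulas behind the two fixes, so the following Java sketch is only an illustration, assuming the standard definitions: information gain as the drop in class entropy, gain ratio as information gain divided by the entropy of the branch sizes (the C4.5-style "split information"), and a two-range split of a continuous attribute at a threshold chosen from the midpoints between adjacent sorted values. The class name `GainRatioSketch`, its method names, and the toy data are hypothetical and are not taken from the thesis.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and data layout are assumptions, not the thesis's code. */
final class GainRatioSketch {

    /** Shannon entropy, in bits, of a vector of non-negative counts. */
    static double entropy(int[] counts) {
        int total = Arrays.stream(counts).sum();
        double h = 0.0;
        for (int c : counts) {
            if (c == 0) continue;                 // 0 * log(0) is treated as 0
            double p = (double) c / total;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    /** Information gain (ID3 criterion); partition[v][k] counts samples with attribute value v and class k. */
    static double informationGain(int[][] partition) {
        int[] classTotals = new int[partition[0].length];
        int total = 0;
        for (int[] branch : partition) {
            for (int k = 0; k < branch.length; k++) {
                classTotals[k] += branch[k];
                total += branch[k];
            }
        }
        double remainder = 0.0;                   // weighted entropy after the split
        for (int[] branch : partition) {
            int size = Arrays.stream(branch).sum();
            remainder += ((double) size / total) * entropy(branch);
        }
        return entropy(classTotals) - remainder;
    }

    /** Gain ratio: information gain normalized by the entropy of the branch sizes. */
    static double gainRatio(int[][] partition) {
        int[] branchSizes = new int[partition.length];
        for (int v = 0; v < partition.length; v++) {
            branchSizes[v] = Arrays.stream(partition[v]).sum();
        }
        double splitInfo = entropy(branchSizes);
        return splitInfo == 0.0 ? 0.0 : informationGain(partition) / splitInfo;
    }

    /** Splits a continuous attribute into two ranges: value <= threshold and value > threshold. */
    static int[][] binarySplit(double[] values, int[] labels, int numClasses, double threshold) {
        int[][] partition = new int[2][numClasses];
        for (int i = 0; i < values.length; i++) {
            partition[values[i] <= threshold ? 0 : 1][labels[i]]++;
        }
        return partition;
    }

    /** Tries each midpoint between adjacent distinct sorted values; returns the one with the highest gain ratio. */
    static double bestThreshold(double[] values, int[] labels, int numClasses) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double best = Double.NaN;                 // stays NaN if all values are equal
        double bestRatio = -1.0;
        for (int i = 0; i + 1 < sorted.length; i++) {
            if (sorted[i] == sorted[i + 1]) continue;
            double t = (sorted[i] + sorted[i + 1]) / 2.0;
            double ratio = gainRatio(binarySplit(values, labels, numClasses, t));
            if (ratio > bestRatio) {
                bestRatio = ratio;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] idLike = {{1, 0}, {1, 0}, {0, 1}, {0, 1}};  // four values, one sample each
        int[][] binary = {{2, 0}, {0, 2}};                  // two values, two samples each
        System.out.println("ID-like: gain=" + informationGain(idLike) + " ratio=" + gainRatio(idLike));
        System.out.println("binary:  gain=" + informationGain(binary) + " ratio=" + gainRatio(binary));
        double[] values = {1.0, 2.0, 3.0, 4.0};
        int[] labels = {0, 0, 1, 1};
        System.out.println("threshold=" + bestThreshold(values, labels, 2));
    }
}
```

In this toy data, both attributes separate the classes perfectly and so have the same information gain of 1 bit, which is why plain ID3 cannot prefer one over the other. The gain ratio penalizes the four-valued attribute (about 0.5 against about 1.0 for the two-valued one), which illustrates how the normalization counters the bias toward many-valued attributes.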
Keywords/Search Tags: Data Mining, Decision Tree, ID3 Algorithm