Research On The ID3 Algorithms Of Decision Tree

Posted on: 2015-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2298330467451350
Subject: Control theory and control engineering
Abstract/Summary:
As one of the most effective methods for dealing with big data, data mining can extract valuable information from massive data sets. Classification is one of the most important subjects in data mining and is widely used in scientific research and business intelligence, and decision trees are among the most important classification methods. Over the past 50 years, many decision tree construction algorithms have been proposed. The ID3 (Iterative Dichotomiser 3) algorithm is one of the most representative and forms the basis of many other decision tree methods. ID3 has attracted growing interest from researchers both at home and abroad because it is clear, simple, convenient to implement, easy to understand, and gives good classification results. However, ID3 also has drawbacks: when choosing the split attribute it favors attributes with more distinct values, it does not optimize the tree during construction, and its logical expression needs to be strengthened.

This paper studies the ID3 algorithm. Its main work and innovations are summarized as follows.

Firstly, the paper analyzes theoretically why ID3 suffers from the multi-value bias problem. The analysis attempts to innovate in two respects: 1) the concept of attribute importance is introduced, based on rough set theory; 2) the influence of an attribute's multiple values on the other attributes is analyzed.

Secondly, a novel modified decision tree algorithm, called the SID3 algorithm, is proposed. To address the shortcomings of ID3, SID3 introduces a function of the number of values an attribute takes, simplifies the ID3 expression, and adds pruning. The experiments show that SID3 overcomes the multi-value bias, reduces the amount of calculation, strengthens the logic of the algorithm, and uses pruning to optimize the constructed tree. Overall, a decision tree built with SID3 is constructed faster, has a more reasonable shape, and achieves a higher accuracy rate.

Finally, a decision tree intelligent system based on the ID3 and SID3 algorithms is implemented on the Visual Studio platform in the C# programming language. In addition, we attempt to apply it to digital medical diagnostics.
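The multi-value bias mentioned in the abstract can be illustrated with a minimal Python sketch of the standard ID3 information-gain criterion. This is not the thesis's SID3 algorithm, whose correction function is not given in the abstract; the function names (`entropy`, `information_gain`) and the toy patient records are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Standard ID3 gain: entropy of the target minus the weighted
    entropy of the target within each value of `attribute`."""
    labels = [row[target] for row in rows]
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: "id" is unique per record and carries no real
# diagnostic meaning, while "fever" is only weakly related to "sick".
rows = [
    {"id": 1, "fever": "yes", "sick": "yes"},
    {"id": 2, "fever": "yes", "sick": "yes"},
    {"id": 3, "fever": "yes", "sick": "no"},
    {"id": 4, "fever": "no", "sick": "no"},
    {"id": 5, "fever": "no", "sick": "no"},
    {"id": 6, "fever": "no", "sick": "yes"},
]

gain_id = information_gain(rows, "id", "sick")
gain_fever = information_gain(rows, "fever", "sick")
print(f"gain(id)    = {gain_id:.3f}")
print(f"gain(fever) = {gain_fever:.3f}")
```

In this sketch the meaningless `id` attribute splits the data into six single-record partitions and therefore receives the maximum possible gain (1 bit), while `fever` receives a gain of roughly 0.08 bits. Plain ID3 would pick `id` as the split attribute, which is the bias that a correction depending on the number of attribute values is meant to counteract.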
Keywords/Search Tags: data mining, decision tree, ID3 algorithm, multi-value bias, SID3 algorithm, medical diagnostics