
Decision Tree Learning Based On General Entropy And Unstable Cut-points

Posted on: 2011-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Zhao
Full Text: PDF
GTID: 2178360308454096
Subject: Computer application technology
Abstract/Summary:
Classification is one of the most important learning problems in machine learning, and decision tree learning is a typical and well-known family of classification algorithms. In recent years, great progress has been made in the research and improvement of decision trees over both symbolic and continuous-valued attributes. Researchers have proposed various attribute selection criteria for continuous-valued decision tree learning algorithms. However, most studies only analyzed and compared the strengths and weaknesses of these criteria and did not examine their common characteristics. Against this background, this research makes three main contributions.

Firstly, the thesis describes the generation process of decision trees over continuous attributes. Drawing on the properties of two classical heuristics, information entropy and the Gini index, it defines the concept of general entropy and further proposes a new attribute selection criterion called the partition general entropy.

Secondly, because generating decision trees with continuous-valued attributes has high computational time complexity, we introduce the concept of the unstable cut-point and prove analytically that the partition general entropy is always minimized at an unstable cut-point. This implies that stable cut-points can be skipped during tree growth. Experimental results also show that the proposed method greatly reduces computational cost and thus improves the algorithm's efficiency.

Finally, hypothesis testing is applied to show that decision trees generated with the proposed general entropy as the attribute selection criterion exhibit no significant difference in generalization ability at a given significance level.
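The thesis's exact definition of general entropy is not reproduced on this page. One common one-parameter family that contains both classical heuristics as special cases is the Tsallis-style entropy, where the limit q→1 recovers Shannon information entropy and q=2 gives the Gini index. The sketch below (all names and the toy data are illustrative, not taken from the thesis) computes the weighted partition criterion at every candidate cut-point of a sorted continuous attribute and checks that the minimizer is an "unstable" cut-point, i.e. one lying between two examples with different class labels; the analogous boundary-point result for information entropy is due to Fayyad and Irani.

```python
import math

def general_entropy(counts, q=2.0):
    """Tsallis-style general entropy of a class-count vector.
    q -> 1 recovers Shannon entropy (natural log); q = 2 gives the Gini index."""
    n = sum(counts)
    if n == 0:
        return 0.0
    ps = [c / n for c in counts if c > 0]
    if abs(q - 1.0) < 1e-9:
        return -sum(p * math.log(p) for p in ps)
    return (1.0 - sum(p ** q for p in ps)) / (q - 1.0)

def partition_entropy(values, labels, cut, q=2.0):
    """Size-weighted general entropy of the binary split at threshold `cut`."""
    classes = sorted(set(labels))
    left = [l for v, l in zip(values, labels) if v <= cut]
    right = [l for v, l in zip(values, labels) if v > cut]
    n = len(labels)
    counts = lambda ls: [ls.count(c) for c in classes]
    return (len(left) / n) * general_entropy(counts(left), q) \
         + (len(right) / n) * general_entropy(counts(right), q)

# Toy data: a continuous attribute already sorted, with class labels.
# Candidate cut-points are midpoints between adjacent distinct values.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ['A', 'A', 'A', 'B', 'B', 'B']
cuts = [(a + b) / 2 for a, b in zip(values, values[1:])]

# "Unstable" (boundary) cut-points: midpoints where the class label changes.
unstable = [(a + b) / 2
            for (a, la), (b, lb) in zip(zip(values, labels),
                                        zip(values[1:], labels[1:]))
            if la != lb]

# The partition criterion is minimized at an unstable cut-point,
# so stable cut-points need not be evaluated during tree growth.
best = min(cuts, key=lambda c: partition_entropy(values, labels, c))
print(best, best in unstable)
```

With q=2 the criterion is exactly the weighted Gini index used by CART, which is why restricting the search to unstable cut-points can cut the per-attribute work substantially when long runs of examples share the same class.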
Keywords/Search Tags: Decision tree, Information entropy, Gini-Index, General entropy, Unstable cut-point, Hypothesis testing