Font Size: a A A

PAC learning a decision tree with pruning

Posted on:1993-09-04Degree:Ph.DType:Dissertation
University:University of FloridaCandidate:Kim, HyunsooFull Text:PDF
GTID:1478390014495365Subject:Business Administration
Abstract/Summary:
This dissertation investigates various theoretical effects of pruning a decision tree. Empirical results have shown that pruning can improve the accuracy of an induced decision tree. Pruning also leads to more concise rules. First we provide a pruning algorithn based on the rank of a decision tree. A bound on the error due to pruning by the rank of a decision tree is determined under the assumptions of an equally likely distribution over the instance space and a deterministic tree labelling rule. This bound is then used with recent results in learning theory to determine a sample size sufficient for Probably Approximately Correct (PAC) identification of decision trees with pruning. We also discuss other pruning rules and their effects on the error due to pruning. With a nondeterministic tree labelling rule, we show that the upperbound of the average pruning error is less than or equal to 0.5 under an equally likely distribution.;In a realistic learning environment it is often not possible to obtain a large enough sample to guarantee PAC learning. For those cases, we provide several methods for a posterior evaluation of the accuracy of a pruned decision tree. We give a method which estimates a lower bound for the worst possible confidence factor, ;Finally we develop conditions under which pruning is necessary for better prediction accuracy as well as for concept simplification. We give an analysis of the reason why pruning is necessary in realistic learning situations.;We generalize a previous result for larger training sets. A Bayesian analysis shows the the average prediction accuracy of the pruned tree increases, and the effect of description noise becomes stronger as the size of the training set increases. For very large training sets, the pruned tree has the prediction accuracy equal to that of the unpruned tree.
Keywords/Search Tags:Tree, Pruning, Prediction accuracy
Related items