
Extended Bayes and skewing: On two improvements to standard induction-based learning algorithms

Posted on: 2006-12-09
Degree: Ph.D
Type: Thesis
University: Polytechnic University
Candidate: Rosell, Bernard
Full Text: PDF
GTID: 2458390008470012
Subject: Artificial Intelligence
Abstract/Summary:
We address improvements to Naive Bayes (NB) and decision trees, two standard induction-based methods for solving classification problems. The goal of these improvements is to extract more information from the training examples, in order to classify new examples more accurately.

The first part of this thesis presents a new learning algorithm, Extended Bayes (EB), an extension of NB. NB classifies new examples using conditional probabilities computed from the training data. It is simple, fast, and widely applicable. EB retains these positive properties of NB while equaling or surpassing NB's predictive power, as measured on a wide variety of benchmark UC Irvine datasets. EB is based on two interacting ideas. The first is to find sets of seemingly dependent attributes and to add them as new attributes. The second is to exploit "zeroes", i.e., the negative evidence provided by attribute values that do not occur in particular classes in the training data. Naive Bayes handles zeroes by smoothing (substituting a small positive value); in contrast, EB uses them as evidence that a potential class labeling may be wrong.

The second part of the thesis presents a theoretical analysis of skewing, a recent technique for improving the performance of standard decision tree algorithms [42]. Decision tree algorithms use the training data to build a decision tree that computes a function mapping examples to class labels. Standard decision tree algorithms perform poorly in learning certain "difficult" functions, such as parity, when irrelevant attributes are present, because they are unable to distinguish relevant from irrelevant attributes. While experimental evidence indicates that skewing can remedy this problem, prior to the work in this thesis there was almost no analysis of when and why skewing works. We prove that, in an idealized setting, skewing can always identify relevant attributes. We also analyze a variant called sequential skewing, and prove results concerning properties of the class of "difficult" functions.
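The abstract describes the contrast between smoothing and EB's treatment of zeroes only in prose. The following Python is a minimal, hypothetical sketch of that contrast, not code from the thesis: the function names and toy data are invented here, and EB's other idea (adding sets of seemingly dependent attributes) is omitted.

```python
import math
from collections import Counter, defaultdict

def train(examples, labels):
    """Collect per-class counts of attribute values from the training data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attr index, class) -> value -> count
    domain = defaultdict(set)             # attr index -> values seen in training
    for x, y in zip(examples, labels):
        for i, v in enumerate(x):
            value_counts[(i, y)][v] += 1
            domain[i].add(v)
    return class_counts, value_counts, domain

def nb_log_score(x, c, class_counts, value_counts, domain, n, alpha=1.0):
    """Standard smoothed NB: a zero count is replaced by a small positive value."""
    score = math.log(class_counts[c] / n)
    for i, v in enumerate(x):
        count = value_counts[(i, c)][v]
        score += math.log((count + alpha) /
                          (class_counts[c] + alpha * len(domain[i])))
    return score

def zero_evidence(x, c, value_counts, domain):
    """EB-style reading of zeroes: a value seen in training but never with
    class c is treated as evidence against labeling x with c."""
    return sum(1 for i, v in enumerate(x)
               if v in domain[i] and value_counts[(i, c)][v] == 0)

# Toy usage on invented data:
examples = [("sunny", "hot"), ("rainy", "cool"), ("sunny", "cool")]
labels = ["no", "yes", "yes"]
cc, vc, dom = train(examples, labels)
x = ("rainy", "hot")
for c in cc:
    print(c, round(nb_log_score(x, c, cc, vc, dom, len(examples)), 3),
          zero_evidence(x, c, vc, dom))
```

The point of the contrast: smoothing quietly erases the information in a zero count, whereas the EB-style tally keeps it available as defeasible evidence against a candidate class.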
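Skewing itself is only named above. Its mechanism, reweighting the training data under randomly chosen "preferred" attribute values so that relevant attributes of hard functions like parity begin to show gain, can be illustrated with a rough toy sketch. This is an assumption-laden illustration, not the procedure analyzed in the thesis or in [42]: the split criterion below is a simplified stand-in for information gain, and the parameters (p = 0.75, 20 skews) are invented.

```python
import random
from itertools import product

def parity(bits, relevant):
    """Target concept: parity over the relevant bit positions (a 'difficult'
    function: under the uniform distribution, no single bit shows any gain)."""
    return sum(bits[i] for i in relevant) % 2

def skewed_weights(examples, preferred, p=0.75):
    """Skewing: reweight each example, multiplying in p when an attribute
    matches the randomly chosen preferred value and 1 - p otherwise."""
    weights = []
    for x in examples:
        w = 1.0
        for v, pref in zip(x, preferred):
            w *= p if v == pref else 1.0 - p
        weights.append(w)
    return weights

def split_gain(examples, labels, weights, attr):
    """Crude stand-in for information gain under the (skewed) weights: how far
    each branch's weighted class balance moves away from 50/50."""
    mass = {0: [0.0, 0.0], 1: [0.0, 0.0]}   # attr value -> weight per label
    for x, y, w in zip(examples, labels, weights):
        mass[x[attr]][y] += w
    gain = 0.0
    for branch in mass.values():
        total = branch[0] + branch[1]
        if total > 0:
            gain += abs(branch[0] - branch[1]) / total
    return gain

# Complete truth table: parity of bits 0 and 1; bits 2-4 are irrelevant.
n_attrs, relevant = 5, (0, 1)
examples = list(product((0, 1), repeat=n_attrs))
labels = [parity(x, relevant) for x in examples]

random.seed(0)
votes = [0] * n_attrs
for _ in range(20):                      # repeat with 20 random skews
    preferred = [random.choice((0, 1)) for _ in range(n_attrs)]
    weights = skewed_weights(examples, preferred)
    gains = [split_gain(examples, labels, weights, a) for a in range(n_attrs)]
    best = max(gains)
    for a in range(n_attrs):             # vote for every attribute tied for best
        if gains[a] >= best - 1e-9:
            votes[a] += 1
print(votes)   # relevant bits 0 and 1 win every skew; irrelevant bits never do
```

Under uniform weights, every attribute of a parity function scores zero, so a standard tree learner cannot tell bits 0 and 1 from the irrelevant ones; each random skew breaks that symmetry, which is the effect the thesis analyzes formally.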
Keywords/Search Tags: Skewing, Bayes, Standard, Improvements, Decision tree, Class