A statistical approach to improving accuracy in classifier ensembles

Posted on:2009-07-13

Degree:Ph.D

Type:Dissertation

University:University of Massachusetts Amherst

Candidate:Holness, Gary F

GTID:1448390005957632

Subject:Artificial Intelligence

Abstract/Summary:

Popular ensemble classifier induction algorithms, such as bagging and boosting, construct the ensemble by optimizing component classifiers in isolation. The controllable degrees of freedom in an ensemble include the instance selection and feature selection for each component classifier. Because these degrees of freedom are uncoupled, the component classifiers are not built to optimize the performance of the ensemble; rather, they are constructed by minimizing individual training loss. Recent work in the ensemble literature contradicts the notion that combining the best individually performing classifiers results in lower ensemble error rates. Zenobi et al. demonstrated that ensemble construction should consider a classifier's contribution to ensemble accuracy and diversity, even at the expense of individual classifier performance. To trade off individual accuracy against ensemble accuracy and diversity, a component classifier inducer requires knowledge of the choices made by the other ensemble members.

We introduce an approach, called DiSCO, that exercises direct control over the tradeoff between diversity and error by sharing ensemble-wide information on instance selection during training. A classifier's contribution to ensemble accuracy and diversity can be measured as it is constructed in isolation, but unless information is shared among its peers in the ensemble during training, nothing can be done to control it. In this work, we explore a method for training the component classifiers collectively by sharing information about training set selection. This allows our algorithm to build ensembles whose component classifiers select complementary error distributions that maximize diversity while minimizing ensemble error directly. Treating ensemble construction as an optimization problem, we explore approaches using local search, global search, and stochastic methods.

Using this approach, we can improve ensemble classifier accuracy over bagging and boosting on a variety of data, particularly data for which the classes are moderately overlapping. In ensemble classification research, how to use diversity to build effective classifier teams is an open question. We also provide a method that uses entropy as a measure of diversity to train an ensemble classifier.
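
The abstract does not say how the entropy-based diversity measure is defined. As a rough illustration only, the Python sketch below uses one common choice: the Shannon entropy of the members' votes on each instance, averaged over instances, reported alongside the majority-vote error of the ensemble. The function names and the `predictions[m][i]` layout are assumptions made for this example; this is not the DiSCO algorithm.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the labels voted for one instance."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ensemble_diversity(predictions):
    """Mean per-instance vote entropy.

    predictions[m][i] is the label that ensemble member m assigns to
    instance i (a hypothetical layout chosen for this sketch).
    """
    n_instances = len(predictions[0])
    per_instance = [
        vote_entropy([member[i] for member in predictions])
        for i in range(n_instances)
    ]
    return sum(per_instance) / n_instances


def majority_vote_error(predictions, labels):
    """Fraction of instances on which the plurality vote is wrong."""
    wrong = 0
    for i, truth in enumerate(labels):
        votes = Counter(member[i] for member in predictions)
        winner = votes.most_common(1)[0][0]
        wrong += winner != truth
    return wrong / len(labels)


if __name__ == "__main__":
    # Three members, four instances; the values are made up for illustration.
    preds = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    truth = [0, 1, 1, 1]
    print("diversity:", ensemble_diversity(preds))
    print("ensemble error:", majority_vote_error(preds, truth))
```

A search procedure of the kind the abstract describes could score candidate training-set selections with both quantities, preferring ensembles whose members disagree (higher entropy) while the combined vote stays accurate. How the two are weighted is left open here because the abstract does not specify it.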

Keywords/Search Tags:

Ensemble, Classifier, Accuracy, Diversity, Approach, Bagging and boosting

Related items

1	Research On Classifier Ensemble
2	Research On Adaptive Boosting Algorithm And Ensemble Classifier
3	The Research Of Two-stage Feature Selection Ensemble Classifier Based On Bagging
4	Project Analysis For Face Recognition Based On Ensemble Learning
5	Research On Classifier Ensemble Based On Decision Tree
6	Research On Classifier Selective Ensemble Method And Their Diversity Measurement
7	Research On Ensemble Learning Algorithm
8	Selective Ensemble Method Based On Diversity Measures
9	Research On Ensemble Technique For Multiple Classifiers
10	Online ensemble learning