
Algorithmic stability and ensemble-based learning

Posted on: 2003-06-05
Degree: Ph.D.
Type: Dissertation
University: The University of Chicago
Candidate: Kutin, Samuel Alan
Full Text: PDF
GTID: 1468390011489179
Subject: Computer Science
Abstract/Summary:
We explore two themes in formal learning theory. We begin with a detailed, general study of the relationship between the generalization error and stability of learning algorithms. We then examine ensemble-based learning from the points of view of stability, decorrelation, and threshold complexity.

A central problem of learning theory is bounding generalization error. Most such bounds have been obtained through uniform convergence, via some variant of VC dimension. These analyses focus on the space of classifiers available to a learning algorithm, rather than on the algorithm itself.

Bousquet and Elisseeff (2002) have shown, using McDiarmid's method of independent bounded differences (1989), that algorithmic stability implies good bounds on generalization error. However, their definition of stability is too restrictive to be widely applicable.

We introduce the more general notion of training stability. We show that training stability implies good bounds on generalization error even when the learner has infinite VC dimension. Our proof requires a result of independent interest: we generalize McDiarmid's theorem to the case when differences are bounded with high probability.

In the Probably-Approximately-Correct setting of Valiant (1984), training stability is necessary and sufficient for good error performance and serves as a distribution-dependent analog of VC dimension. This enables us to emphasize the learning algorithm instead of the classifier space.

We next consider algorithms (e.g., boosting) which construct ensembles of weak classifiers. We show that AdaBoost (Freund and Schapire, 1997), a widely used boosting algorithm, is stability-preserving.

We demonstrate that some boosting algorithms implicitly construct classifiers which are decorrelated (i.e., rarely jointly wrong). We present two new algorithms which build ensembles of explicitly decorrelated classifier pairs. Experiments indicate that these new algorithms are sometimes competitive with AdaBoost and with bagging (Breiman, 1996). The experiments employ a fast new algorithm for finding decorrelated pairs of decision stumps.

Lastly, we use threshold complexity to show that, under certain conditions, AdaBoost takes a long time to converge. In some situations, our pair-based algorithms are exponentially faster than AdaBoost. We analyze boosting as a dynamical system over the space of sequences of weights.
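For context, the classical results the abstract builds on can be stated as follows. These are the standard formulations due to McDiarmid (1989) and Bousquet and Elisseeff (2002), not the dissertation's extended versions in which the differences are only bounded with high probability.

McDiarmid's bounded-differences inequality: if X_1, ..., X_m are independent and f changes by at most c_i when its i-th argument is replaced, then for every epsilon > 0,

\[
\Pr\big[f(X_1,\dots,X_m) - \mathbb{E}\,f(X_1,\dots,X_m) \ge \epsilon\big]
\;\le\; \exp\!\left(\frac{-2\epsilon^2}{\sum_{i=1}^{m} c_i^2}\right).
\]

Uniform-stability generalization bound (Bousquet and Elisseeff, 2002): if a learning algorithm A has uniform stability \beta with respect to a loss bounded by M, then with probability at least 1 - \delta over an i.i.d. training sample S of size m,

\[
R(A_S) \;\le\; \hat{R}_{\mathrm{emp}}(A_S) + 2\beta + (4m\beta + M)\sqrt{\frac{\ln(1/\delta)}{2m}}.
\]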
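The boosting algorithm referred to above is the standard AdaBoost weighting scheme of Freund and Schapire (1997). The sketch below is a generic textbook implementation with decision stumps as weak learners, written in Python for illustration; it is not the dissertation's experimental code, and the function names are ours.

# Textbook AdaBoost with decision stumps as weak learners (illustrative sketch).
import numpy as np

def fit_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump minimizing weighted error."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)                     # (error, feature, threshold, polarity)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, T=50):
    """Labels y must be in {-1, +1}. Returns a list of (alpha, stump) pairs."""
    n = len(y)
    w = np.full(n, 1.0 / n)                        # uniform initial example weights
    ensemble = []
    for _ in range(T):
        err, j, thr, pol = fit_stump(X, y, w)
        err = max(err, 1e-12)                      # guard against zero weighted error
        alpha = 0.5 * np.log((1 - err) / err)      # weak learner's vote weight
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)             # up-weight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, (j, thr, pol)))
    return ensemble

def predict(ensemble, X):
    """Weighted-majority vote of the stumps in the ensemble."""
    score = np.zeros(len(X))
    for alpha, (j, thr, pol) in ensemble:
        score += alpha * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    return np.sign(score)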
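A decorrelated pair, in the abstract's sense, is a pair of classifiers that is rarely jointly wrong. The following naive Python sketch only makes that objective concrete by brute-force search over pairs of decision stumps; it is not the fast pairing algorithm the experiments employ, and all names here are ours.

# Illustrative brute-force search for the most decorrelated pair of decision
# stumps, i.e. the pair minimizing the fraction of jointly misclassified examples.
import numpy as np
from itertools import combinations

def stump_predictions(X):
    """Enumerate the predictions of every (feature, threshold, polarity) stump."""
    n, d = X.shape
    stumps, preds = [], []
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                stumps.append((j, thr, pol))
                preds.append(np.where(pol * (X[:, j] - thr) >= 0, 1, -1))
    return stumps, np.array(preds)

def most_decorrelated_pair(X, y):
    """Return the stump pair with the smallest joint-error rate on (X, y)."""
    stumps, preds = stump_predictions(X)
    wrong = preds != y                             # boolean matrix: stump x example
    best, best_pair = np.inf, None
    for a, b in combinations(range(len(stumps)), 2):
        joint = np.mean(wrong[a] & wrong[b])       # fraction jointly wrong
        if joint < best:
            best, best_pair = joint, (stumps[a], stumps[b])
    return best_pair, best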
Keywords/Search Tags: Algorithm, Stability, VC dimension, Generalization error, Boosting, AdaBoost