
LibD3C2.0:an Ensemble Classifier Based On Clustering And Its Parallel Implementation

Posted on: 2015-03-08  Degree: Master  Type: Thesis
Country: China  Candidate: W Q Chen  Full Text: PDF
GTID: 2268330428461177  Subject: Computer application technology
Abstract/Summary:
A learning algorithm's generalization performance is one of the central concerns of machine learning. Ensemble learning combines different classifiers into a single model to achieve higher generalization performance than any individual member classifier. Ensemble learning has two sub-processes: first, the generation of the base classifiers; second, the combination of the individual weak classifiers. The success of an ensemble system rests on the diversity of its weak classifiers. An intuitive approach at the generation stage is to construct a large number of classifiers in the hope of achieving higher performance. However, too many classifiers place strict demands on the available computing power and storage capacity. Zhou proposed the theory of selective ensembles, summarized as "Many Could Be Better Than All": shrink the number of base classifiers by removing redundant ones that contribute little to the improvement of the ensemble system. Theoretical analysis and extensive experiments show that selective ensembles are superior to Boosting and Bagging.

In this thesis, we concentrate on improving the diversity of the weak classifiers at the generation stage and on designing an effective ensemble strategy to achieve high generalization ability. The main contributions of our work are as follows:

1. Generation of weak classifiers: taking the distribution of the original data set into consideration, we propose a random subspace method to manipulate the original data. The manipulated data is then used to train the individual classifiers, aiming to balance the diversity and performance of the weak classifiers.

2. Selection of base classifiers: after analyzing existing diversity measures, we choose the disagreement measure as our diversity measure. We apply clustering algorithms to remove redundant classifiers and use the resulting subset of classifiers for the ensemble.

3. Combination of base classifiers: we propose a hybrid approach. The hybrid model is based on affinity propagation clustering and a framework of dynamic selection and circulating, combined with a sequential search method.

4. In addition, we present a parallel framework for ensemble learning based on the methods above, named LibD3C2.0, to cope with the problems that arise when there are a large number of base classifiers.
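To make the random subspace idea concrete, here is a minimal sketch of drawing one feature subset per base classifier and projecting the data onto it. This is illustrative only: the thesis's distribution-aware subspace scheme is not detailed in the abstract, and the function names are assumptions.

```python
import random

def random_subspaces(n_features, n_classifiers, subspace_size, seed=0):
    """Draw one random feature subset per base classifier."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_classifiers)]

def project(rows, features):
    """Restrict each sample (a list of feature values) to the chosen subset."""
    return [[row[f] for f in features] for row in rows]
```

Each base classifier is then trained on its own projected view of the data, which is what injects diversity into the pool.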
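The disagreement measure used for selection is the standard pairwise diversity measure: the fraction of samples on which two classifiers predict differently. A minimal implementation:

```python
def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers' predictions differ.
    Ranges from 0 (identical behaviour) to 1 (always disagree)."""
    assert len(preds_a) == len(preds_b)
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)
```

Higher values indicate more diverse classifiers; a pairwise matrix of these values is a natural input to a clustering step.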
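The selection step (removing redundant classifiers via clustering on the disagreement measure) can be sketched as follows. Note this uses a simple greedy rule as a stand-in for the thesis's affinity propagation clustering, which the abstract names but does not detail; the threshold and function name are assumptions.

```python
def prune_by_disagreement(predictions, threshold=0.2):
    """Greedy pruning: keep a classifier only if it disagrees with every
    already-kept classifier on more than `threshold` of the samples.
    (Simplified stand-in for affinity-propagation-based selection.)"""
    def disagreement(a, b):
        return sum(x != y for x, y in zip(a, b)) / len(a)

    kept = []
    for i, preds in enumerate(predictions):
        if all(disagreement(preds, predictions[j]) > threshold for j in kept):
            kept.append(i)
    return kept
```

Classifiers whose predictions nearly duplicate an already-selected one are dropped, shrinking the ensemble in the "Many Could Be Better Than All" spirit.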
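Since training many base classifiers is embarrassingly parallel, the parallel framework of contribution 4 can be sketched with a worker pool. The abstract does not describe LibD3C2.0's actual design, so the trainer below is a placeholder (it just returns the majority label of its data subset) and all names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def train_base_classifier(subset):
    """Placeholder trainer: returns the majority label of the subset.
    A real implementation would fit an actual model here."""
    labels = [y for _, y in subset]
    return max(set(labels), key=labels.count)

def train_all_parallel(subsets, max_workers=4):
    """Train one base classifier per data subset, in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_base_classifier, subsets))
```

A thread pool keeps the sketch simple; for CPU-bound model fitting, a process pool (or the distributed setting LibD3C2.0 targets) would be the natural substitute.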
Keywords/Search Tags: Selective Ensemble Learning, Clustering, Dynamic Selection