
Research on Multiple Classifier Ensemble Algorithms

Posted on: 2010-04-12    Degree: Master    Type: Thesis
Country: China    Candidate: B Chen    Full Text: PDF
GTID: 2178360275962602    Subject: Computer software and theory
Abstract/Summary:
In recent years, many kinds of fusion methods have been widely applied to the recognition of human faces, handwritten characters, remote-sensing images, and so on. Classifier ensemble techniques exploit the complementarity among different classifiers to improve performance after combination. In general, ensemble performance is improved in two ways: by strengthening the classification capability of each component classifier and by increasing the diversity among the component classifiers. Traditional ensemble methods still have shortcomings, however: every classifier in the ensemble is of a single type, and the characteristics of the dataset are not fully considered when the classifiers are selected, so some samples are not well identified. To achieve the best performance on almost all samples, compatible classifiers should be chosen according to the concrete recognition task, and different ensemble methods should be applied to different types of samples.

The purpose of this research is to achieve diversity among the component classifiers and thereby improve the performance of the ensemble. Beyond pursuing better performance from each single classifier, we consider how to exploit the distribution of the training samples to realize that diversity. The main contributions of this dissertation are summarized as follows.

First, two dynamic multiple-classifier ensemble methods, DEA and EMDA, are proposed. In DEA, the training dataset is partitioned into several smaller subsets according to its class labels, and the test dataset is partitioned by clustering, with the number of clusters guided by the number of class labels in the training data. Each test cluster is then matched to a training subset by Euclidean distance. Member classifiers of different types, built on the AdaBoost algorithm, are trained on each of the smaller training subsets; the better-performing classifiers are selected and used to classify the corresponding test subsets. DEA selects the better-performing classifiers by error rate; inspired by DEA, EMDA instead uses information entropy as the selection criterion. We implement both algorithms on the Weka platform and compare their results with those of AdaBoost. Experimental results on 15 datasets show that DEA and EMDA achieve higher accuracy and better generalization ability.
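As an illustration only, the following is a minimal Python/scikit-learn sketch of the DEA scheme; the thesis implements it on the Weka platform. The function name dea_predict, the choice of k-means for clustering the test set, and the use of varying tree depths to obtain members of different types are assumptions made for the sketch, not the thesis's exact procedure.

```python
# Illustrative sketch of the DEA idea (assumptions noted above); the thesis
# itself was built on Weka, not scikit-learn.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def dea_predict(X_train, y_train, X_test):
    classes = np.unique(y_train)

    # Partition the training data by class label; keep each group's centroid.
    groups = {c: X_train[y_train == c] for c in classes}
    centroids = np.array([g.mean(axis=0) for g in groups.values()])

    # Cluster the test set, with the class-label count guiding k.
    km = KMeans(n_clusters=len(classes), n_init=10, random_state=0)
    cluster_ids = km.fit_predict(X_test)

    # AdaBoost members of different types (scikit-learn >= 1.2 API);
    # varying the base-tree depth is one way to obtain diverse members.
    members = [
        AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=d),
                           n_estimators=50, random_state=0).fit(X_train, y_train)
        for d in (1, 2, 3)
    ]

    y_pred = np.empty(len(X_test), dtype=y_train.dtype)
    for k in range(len(classes)):
        mask = cluster_ids == k
        if not mask.any():
            continue
        # Match this test cluster to the nearest training group
        # by Euclidean distance between centroids.
        nearest = classes[np.argmin(
            np.linalg.norm(centroids - km.cluster_centers_[k], axis=1))]
        Xg = groups[nearest]
        yg = np.full(len(Xg), nearest)
        # DEA: dynamically select the member with the lowest error rate on
        # the matched group. EMDA would swap this criterion for the
        # information entropy of each member's class-probability outputs.
        errors = [1.0 - m.score(Xg, yg) for m in members]
        y_pred[mask] = members[int(np.argmin(errors))].predict(X_test[mask])
    return y_pred
```

The per-cluster selection step is where DEA and EMDA differ: only the criterion in the final loop changes, so both variants can share the same clustering and matching machinery.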
Second, an approach to multiple-classifier ensemble based on feature selection (FSCE) is proposed. Attribute selection is applied to the training dataset, which is thereby mapped into as many new training datasets as there are attributes (the class attribute is ignored). Classifiers are trained on each of the smaller training datasets, the better-performing ones are selected, and they are used to classify the corresponding test datasets, which are projected by the same attribute selection. We implement FSCE on the Weka platform, test it on 12 datasets, and compare its classification performance with that of member classifiers trained with the AdaBoost algorithm; in this way, the utility of FSCE is demonstrated.
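Likewise, here is a minimal sketch of the FSCE idea, under the assumptions that each new training set is the original with one attribute dropped and that member selection uses held-out accuracy; fsce_fit_predict, keep_frac, and the majority vote are hypothetical details, not taken from the thesis.

```python
# Illustrative sketch of an FSCE-style feature-selection ensemble
# (assumptions noted above); the thesis itself runs on Weka.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

def fsce_fit_predict(X, y, X_test, keep_frac=0.5):
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.3, random_state=0)
    n_attrs = X.shape[1]  # the class attribute is already excluded from X

    # One projected view (and one AdaBoost member) per attribute.
    views, members, scores = [], [], []
    for j in range(n_attrs):
        cols = [c for c in range(n_attrs) if c != j]
        m = AdaBoostClassifier(n_estimators=50, random_state=0)
        m.fit(X_tr[:, cols], y_tr)
        views.append(cols)
        members.append(m)
        scores.append(m.score(X_val[:, cols], y_val))  # held-out accuracy

    # Keep only the better-performing members.
    selected = np.argsort(scores)[::-1][: max(1, int(keep_frac * n_attrs))]

    # Each selected member classifies the test set projected with its own
    # view; a simple majority vote combines their predictions.
    preds = np.array([members[i].predict(X_test[:, views[i]])
                      for i in selected])

    def majority(column):
        values, counts = np.unique(column, return_counts=True)
        return values[np.argmax(counts)]

    return np.apply_along_axis(majority, 0, preds)
```

In the Weka setup the thesis describes, the attribute projection and the AdaBoost members would instead be realized with Weka's attribute-selection filters and AdaBoostM1.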
Keywords/Search Tags: classifier ensemble, information entropy, clustering, AdaBoost algorithm, feature selection