Research On Classifier-selection-based Ensemble Learning Algorithm

Posted on: 2021-01-18  Degree: Master  Type: Thesis
Country: China  Candidate: W H Chen
GTID: 2428330611966951  Subject: Computer Science and Technology
Abstract/Summary:
Information technology has made work, life, learning, and entertainment more convenient, but it also generates more and more heterogeneous data. Using machine learning to mine useful information from these data and apply it to everyday life is both an opportunity and a challenge for researchers in machine learning and data mining. Ensemble learning is one of the most important research branches in this field. Traditional classifier ensemble methods usually increase the diversity of the classifiers by training each base classifier on a selected subset of features or samples, which improves the generalization ability of the ensemble model. However, these methods have two limitations: (1) selecting a subset of features or samples means that the remaining features or samples are discarded when a base classifier is trained, which easily causes information loss in an individual branch and lowers the accuracy of the base classifier; (2) these methods seldom consider improving the classification performance of the ensemble model by removing redundant and invalid classifiers.

To address limitation 1, a new ensemble learning method, the hybrid dimensional-reduction forest (HDRF), is proposed. It improves the diversity among the branches of the ensemble system while retaining more training sample information for each branch. First, the effective features are separated out by a tree-based feature selection algorithm, and different training subsets are obtained by Bagging. Then, a sample-to-feature conversion process (SFTP) based on the similarity between samples generates extended features for the selected samples, and PCA is applied to the unselected and extended features to reduce their dimensionality and remove noise features, yielding compact, compensated new features.

To address limitation 2, this work designs a new ensemble pruning framework for HDRF (HDRFPF) built on adaptive dynamic density pruning (FPP), which takes into account both the similarity between classifiers and the classification performance of each base classifier. It removes redundant and invalid classifiers through similarity-based density clustering and performance-based pruning. Experiments on 23 high-dimensional data sets verify the classification performance of the proposed ensemble model, and the results show that it classifies better than mainstream classifier ensemble methods.
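
The pruning idea above (drop invalid classifiers by performance, drop redundant ones by similarity) can be illustrated with a minimal sketch. This is not the thesis's FPP algorithm: the abstract gives no details of the density clustering, so a simple greedy similarity filter is used here as a stand-in. The function names, the agreement-rate similarity measure, and the thresholds `sim_threshold` and `min_accuracy` are all assumptions made for illustration.

```python
from typing import List, Sequence


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of validation samples on which two classifiers agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of validation samples a classifier labels correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def prune_ensemble(
    preds: List[Sequence[int]],
    truth: Sequence[int],
    sim_threshold: float = 0.9,   # assumed cutoff for "redundant"
    min_accuracy: float = 0.5,    # assumed cutoff for "invalid"
) -> List[int]:
    """Return the indices of the classifiers kept after pruning.

    preds[i] holds classifier i's predicted labels on a validation set.
    """
    acc = [accuracy(p, truth) for p in preds]

    # Performance-based step: discard classifiers below the accuracy cutoff.
    candidates = [i for i, a in enumerate(acc) if a >= min_accuracy]

    # Similarity-based step (greedy stand-in for density clustering):
    # visit candidates from most to least accurate and keep one only if
    # it is not too similar to any classifier already kept.
    kept: List[int] = []
    for i in sorted(candidates, key=lambda i: (-acc[i], i)):
        if all(agreement(preds[i], preds[k]) < sim_threshold for k in kept):
            kept.append(i)
    return sorted(kept)


# Toy validation set with four hypothetical classifiers.
demo_truth = [0, 1, 1, 0, 1, 0, 1, 1]
demo_preds = [
    [0, 1, 1, 0, 1, 0, 1, 0],  # accurate
    [0, 1, 1, 0, 1, 0, 1, 0],  # duplicate of the first (redundant)
    [1, 0, 0, 1, 0, 1, 0, 0],  # always wrong (invalid)
    [0, 1, 0, 0, 1, 1, 1, 1],  # less accurate but makes different errors
]
print(prune_ensemble(demo_preds, demo_truth))  # [0, 3]
```

In this toy run the duplicate and the always-wrong classifier are removed, while the weaker but dissimilar classifier is kept, which is the kind of diversity-preserving pruning the abstract describes. A real implementation would replace the greedy filter with a proper density clustering over the classifier similarity matrix.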
Keywords/Search Tags: Data Mining, Ensemble Learning, Ensemble Pruning, Feature Transformation