
Research On Subspace Ensemble Learning

Posted on: 2021-01-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D X Wang
Full Text: PDF
GTID: 1368330611467081
Subject: Computer Science and Technology
Abstract/Summary:
Over the last few decades, ensemble learning methods have drawn the attention of many researchers. Most traditional single learning algorithms have their own limitations and are not suitable for handling all kinds of data. For example, kernel-based methods are suitable for high-dimensional data with a small sample size, while convolutional neural network-based methods require sufficient samples. Ensemble learning methods combine the results of multiple learners and reduce the error caused by any single learner. Researchers have put much effort into ensemble learning in various fields such as classification, clustering, and semi-supervised learning. The core problem of ensemble learning is to increase both the accuracy and the diversity of the ensemble members, since both benefit the result of the ensemble. To address this problem, researchers have proposed many different methods for processing data, which change machine learning algorithms in different ways. Subspace methods process data from the perspective of features, and sampling methods process data from the perspective of data samples.

This thesis focuses on the subspace method. The subspace method does not use all the features in the classification or clustering process; instead, learning is performed on a low-dimensional subspace of the data. The features are usually selected randomly, so this method is often used in ensemble learning to increase the diversity of the ensemble. Using the subspace technique in classification or clustering can reduce the impact of redundant features, increase the diversity of the ensemble, reduce data processing time, and improve the overall performance of the ensemble learning algorithm. In this thesis, the subspace method is combined with different machine learning problems and applied to classification, clustering, and stream classification. The subspace method is usually used independently, but using it together with other methods such as data sampling can further improve the results of ensemble learning. Although most algorithms consider only the feature dimension or only the sample dimension of the data, we combine the optimization of both dimensions to provide better results. We also combine the subspace technique with multi-view learning to solve clustering problems, and with ensemble selection for classification. The main work of this thesis is as follows:

(1) To deal with the problem of finding suitable subspaces in classification problems, a progressive subspace ensemble learning algorithm (PSEL) is proposed. In this algorithm, we first combine the random subspace technique and the data sampling technique to generate the initial set of classifiers. We then design a progressive selection process that uses a short-term cost function and a long-term cost function, both defined in the thesis, to select the classifiers. The final result is calculated by weighted voting. In our comparison with several existing algorithms, PSEL obtained better results.

(2) To deal with the problem of finding suitable subspaces in clustering problems, a clustering ensemble framework based on multi-view learning is proposed. We first propose 3 view transformation methods that transform the features of the data to obtain transformed data. Based on these 3 view transformation methods and a multi-view learning algorithm, we then propose a new clustering ensemble algorithm named Random Transformation and Hybrid Multi-view Learning based Clustering Ensemble (RTHMC). We further propose the SRTHMC algorithm, which combines RTHMC with the random subspace technique, and the SORTHMC method, which adds an adaptive optimization algorithm to SRTHMC. In our comparison with several existing clustering ensemble algorithms, the proposed algorithms obtained better results.

(3) To deal with the problem of finding suitable subspaces in stream classification problems, Double Optimization Stream Data Subspace Classification Ensemble (DOSDSCE) is proposed. DOSDSCE combines subspace selection and sample selection. When processing a new data chunk, it generates a new subspace and trains a new classifier, and it removes classifiers with low weights from the ensemble. A method that uses a multi-objective optimization algorithm to select new data samples and update the old classifiers is also proposed.
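
As a rough illustration of the general idea the abstract builds on, combining random feature subspaces with data sampling to obtain diverse ensemble members, the following is a minimal sketch in Python. It is not the thesis's PSEL algorithm: the progressive selection step and the short-term and long-term cost functions are not specified in the abstract and are omitted, plain majority voting stands in for weighted voting, and the nearest-centroid base learner is an assumption chosen only to keep the example self-contained. All function and parameter names (`fit_subspace_ensemble`, `subspace_size`, and so on) are hypothetical.

```python
import random
from collections import Counter


def train_centroids(X, y, features):
    """Nearest-centroid base learner restricted to the given feature indices."""
    sums, counts = {}, Counter()
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
        counts[label] += 1
    # Per-class mean of each selected feature.
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroids(centroids, features, row):
    """Return the label whose centroid is closest to `row` within the subspace."""
    def sq_dist(centroid):
        return sum((row[f] - centroid[j]) ** 2 for j, f in enumerate(features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def fit_subspace_ensemble(X, y, n_members=15, subspace_size=2, seed=0):
    """Train each member on a random feature subspace and a bootstrap sample."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    members = []
    for _ in range(n_members):
        features = sorted(rng.sample(range(d), subspace_size))  # random subspace
        idx = [rng.randrange(n) for _ in range(n)]              # bootstrap sample
        X_boot = [X[i] for i in idx]
        y_boot = [y[i] for i in idx]
        members.append((features, train_centroids(X_boot, y_boot, features)))
    return members


def predict_ensemble(members, row):
    """Combine member predictions by unweighted majority vote (an assumption)."""
    votes = Counter(predict_centroids(c, f, row) for f, c in members)
    return votes.most_common(1)[0][0]
```

Each member sees a different subset of features and a different resample of the training rows, which is the source of diversity the abstract refers to. A selection or weighting stage such as the one the thesis describes would operate on the `members` list returned here.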
Keywords/Search Tags: classification ensemble, clustering ensemble, random subspace, multi-view learning, ensemble selection, stream data classification