
Adaptive Semi-supervised Ensemble Classification In High Dimensional Data

Posted on: 2020-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Zhang
Full Text: PDF
GTID: 2428330590961117
Subject: Computer technology
Abstract/Summary:
Machine learning is now widely used across industries. In pattern classification, acquiring labeled data requires considerable manpower and material resources, whereas unlabeled data are often easy to collect, and many datasets are high-dimensional. This motivates semi-supervised classification for high-dimensional data.

Most current semi-supervised classification algorithms, especially graph-based ones, focus on fitting the labeled and unlabeled samples to obtain the sample distribution, and ignore the noise and redundancy present in high-dimensional data. Work on high-dimensional data generally applies semi-supervised feature extraction (dimension reduction or manifold learning) without incorporating unlabeled information into the classification process, and feature selection typically relies on the labeled view alone. Moreover, the traditional random subspace technique assigns equal importance to every base semi-supervised classifier in the ensemble, although different base classifiers contribute differently to the final result. Different ways of selecting unlabeled samples introduce randomness into the learning process, so the subspaces are prone to errors, and the methods are sensitive to parameter values, which leads to unstable results.

In this thesis, starting from the subspace perspective, we propose two semi-supervised classifier ensemble algorithms, both based on the ideas of heuristic selection and co-learning. An auxiliary training set of high-confidence samples drives single-objective or multi-objective optimization for feature selection, and classifier training is combined with manifold learning theory. The auxiliary training set is also used to learn the subspace weights in local and global subspace selection.

The first algorithm is Adaptive selection Semi-supervised Classifier Ensemble (ASCE). ASCE selects feature subspaces by single-objective selection, and its final ensemble strategy optimizes the subspace weights from a global view. The second algorithm is Multi-Objective Semi-supervised Classifier Ensemble (MOSCE). MOSCE performs multi-objective feature selection that considers the relevance of features, the redundancy between features, and the data reconstruction error. The auxiliary training set is then used to select the locally optimal subspace, which improves the performance of the classifier corresponding to each individual subspace. Both ASCE and MOSCE are closely integrated with semi-supervised learning, which can improve the accuracy and robustness of semi-supervised classification on high-dimensional datasets.

In the experiments, the performance of the two algorithms is verified on 18 high-dimensional datasets. ASCE and MOSCE are compared with state-of-the-art semi-supervised classification algorithms, and the key techniques and parameter sensitivity of the algorithms are analyzed. The experimental results show that MOSCE and ASCE achieve better classification results in semi-supervised classification of high-dimensional datasets.
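
The contrast the abstract draws between equal-weight random subspaces and weighted subspaces can be illustrated with a minimal sketch. This is not the ASCE or MOSCE algorithm: the thesis does not give its procedure here, so the sketch makes several assumptions. The base learner is a plain nearest-centroid classifier rather than a semi-supervised one, each subspace weight is simply its accuracy on the auxiliary set rather than the result of an optimization, the auxiliary set is assumed to be already built and pseudo-labeled, and all function and parameter names are hypothetical.

```python
import random


def centroid_fit(X, y, feats):
    """Nearest-centroid base learner: per-class mean over the chosen features."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, f in enumerate(feats):
            acc[k] += row[f]
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def centroid_predict(model, feats, row):
    """Return the class whose centroid is closest in the chosen subspace."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, model[c]))
    return min(model, key=dist)


def fit_weighted_subspace_ensemble(X_lab, y_lab, X_aux, y_aux,
                                   n_subspaces=10, subspace_size=2, seed=0):
    """Train one base learner per random feature subspace and weight it.

    Assumption: the weight of a subspace is its accuracy on the auxiliary
    (high-confidence) set, normalized so that all weights sum to 1.
    """
    rng = random.Random(seed)
    n_features = len(X_lab[0])
    members = []
    for _ in range(n_subspaces):
        feats = sorted(rng.sample(range(n_features), subspace_size))
        model = centroid_fit(X_lab, y_lab, feats)
        hits = sum(centroid_predict(model, feats, r) == t
                   for r, t in zip(X_aux, y_aux))
        members.append((feats, model, hits / len(X_aux)))
    total = sum(w for _, _, w in members)
    if total == 0:  # every subspace failed: fall back to equal weights
        return [(f, m, 1.0 / len(members)) for f, m, _ in members]
    return [(f, m, w / total) for f, m, w in members]


def predict(ensemble, row):
    """Weighted vote over the base learners."""
    votes = {}
    for feats, model, w in ensemble:
        label = centroid_predict(model, feats, row)
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)
```

With equal weights this reduces to the traditional random subspace vote. The weighting step only lowers the influence of subspaces that perform poorly on the auxiliary set; it does not select or repair them, which is the part the thesis addresses with its optimization-based selection.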
Keywords/Search Tags: Semi-Supervised Classification, Ensemble Learning, Adaptive Optimization Selection, High Dimensional Data