
Semi-Supervised Ensemble For Classification Learning

Posted on: 2018-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: X N Cheng
Full Text: PDF
GTID: 2348330536477763
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Ensemble learning and semi-supervised learning are two important research directions in machine learning. Semi-supervised learning mainly studies how to improve a classifier using unlabeled samples, while ensemble learning, which belongs to supervised learning, studies how to combine several different classifiers so that the combination performs better than any single classifier. In the sample space, labeled samples are far scarcer than unlabeled ones, so using only the labeled samples discards the information hidden in the unlabeled data. Since ensemble learning is a representative approach that relies on labeled samples alone, this paper studies how to combine ensemble learning with semi-supervised learning to improve the learning ability of the classifier.

This paper focuses on multi-classification, treated differently from the traditional setting. The traditional method decomposes a multi-class problem into one-versus-rest, one-versus-one, or many-versus-many binary problems; this paper instead encodes the class label of the multi-class problem as a binary vector.

Building on the label-propagation-based semi-supervised ensemble algorithm SSE2 from the literature, this paper designs an ensemble semi-supervised classification algorithm based on labeled-sample loss. The algorithm uses the Bagging method as its framework and SVM as the base classifier. On the one hand, all labeled samples are used to build twenty different SVM sub-classifiers; on the other, a weighted list of unlabeled samples is drawn at random by the Bootstrap method, and a KNN classifier is used to label these samples, which are then merged with the labeled data set to expand it. Other base classifiers could also be used in this framework. The final classification result is obtained by majority voting.

Experiments were performed by cross-validation on 11 UCI public data sets. The results are as follows. First, the same classifier performs differently on different data sets, and classifier performance increases with the number of labeled samples. Second, when labeled samples account for 10% and 20% of the data, the accuracy of the improved ensemble semi-supervised classification algorithm exceeds that of the ensemble semi-supervised algorithm that ignores labeled-sample loss and KNN selection of unlabeled samples by 5.77% and 3.52% on average.
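The training scheme described above (a Bagging framework with SVM base classifiers, KNN pseudo-labeling of a bootstrap-sampled unlabeled pool, and majority voting) can be sketched as follows. This is a minimal illustration, not the thesis's exact algorithm: the labeled-sample loss weighting and the SSE2 details are omitted, the dataset (Iris), labeling fraction, and all parameter choices are assumptions for demonstration only.

```python
# Minimal sketch (NOT the thesis's exact algorithm): semi-supervised Bagging.
# SVMs are trained on bootstrap resamples of an expanded labeled set; the
# expansion comes from KNN pseudo-labels on a bootstrap sample of the
# unlabeled pool; prediction is by majority vote over the 20 base SVMs.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)

# Simulate the semi-supervised setting: labels kept for ~20% of samples.
mask = np.zeros(len(y), dtype=bool)
mask[rng.choice(len(y), size=30, replace=False)] = True
X_lab, y_lab = X[mask], y[mask]
X_unlab = X[~mask]

# Step 1: pseudo-label a bootstrap sample of the unlabeled pool with KNN.
knn = KNeighborsClassifier(n_neighbors=3).fit(X_lab, y_lab)
idx = rng.choice(len(X_unlab), size=len(X_unlab) // 2, replace=True)
X_pseudo = X_unlab[idx]
y_pseudo = knn.predict(X_pseudo)

# Step 2: expand the labeled set, then train 20 SVMs on bootstrap resamples.
X_train = np.vstack([X_lab, X_pseudo])
y_train = np.concatenate([y_lab, y_pseudo])
classifiers = []
for _ in range(20):
    boot = rng.choice(len(X_train), size=len(X_train), replace=True)
    clf = SVC(kernel="rbf", gamma="scale").fit(X_train[boot], y_train[boot])
    classifiers.append(clf)

# Step 3: combine the 20 base classifiers by majority vote.
def predict(Xq):
    votes = np.stack([clf.predict(Xq) for clf in classifiers])  # (20, n)
    return np.apply_along_axis(
        lambda col: np.bincount(col).argmax(), 0, votes)

acc = (predict(X) == y).mean()
print(f"ensemble accuracy on all samples: {acc:.3f}")
```

In a proper evaluation the accuracy would be measured by cross-validation on held-out data, as in the experiments above, rather than on the full sample as this toy script does.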
Keywords/Search Tags:Ensemble learning, semi-supervised learning, Bagging, Cross-validation