Exploration Of Safe Graph-Based Semi-Supervised Learning And Automated Semi-Supervised Learning

Posted on: 2019-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
GTID: 2428330545977967
Subject: Computer technology
Abstract/Summary:
In many machine learning applications, unlabeled training examples are abundant, but labeled ones are expensive to obtain because labeling requires considerable human effort. It is well known that semi-supervised learning can achieve much better performance by using both labeled and unlabeled data. However, it also has deficiencies. First, current semi-supervised learning approaches have been found to perform even worse than learning from the labeled data alone in many cases. Second, model selection and hyperparameter optimization for semi-supervised learning methods consume large amounts of computing resources and human effort. This dissertation focuses on semi-supervised learning, and its main contributions are as follows.

(1) To address the problem that using unlabeled data may degrade the performance of semi-supervised learning methods, we propose an instance selection method for graph-based semi-supervised learning that reduces the chance of performance degeneration. The basic idea is that, given a set of unlabeled instances, exploiting all of them is not the best approach; instead, we should exploit the unlabeled instances that are highly likely to improve performance and leave out the ones that carry high risk. We develop both transductive and inductive variants of the method. Experiments on a broad range of data sets show that the chance of performance degeneration with the proposed method is much smaller than with many state-of-the-art graph-based semi-supervised learning methods.

(2) To address the problem that model selection and hyperparameter optimization for semi-supervised learning methods consume large amounts of computing resources and human effort, we design an Automated Semi-Supervised Learning (Auto-SSL) system. First, we use unsupervised learning methods for feature generation to obtain characteristics of the dataset that can be computed efficiently and that help determine which algorithm to use on a new dataset. Second, a large margin separation method is used to optimize the hyperparameters effectively. Extensive experimental results demonstrate that the proposed method improves robustness compared with semi-supervised learning methods and achieves results that are highly competitive with the automated machine learning system Auto-sklearn.
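
The instance selection idea in contribution (1) can be illustrated with a small sketch. The abstract does not state how risky unlabeled instances are identified, so everything specific here is an assumption made for illustration: the Gaussian similarity graph, the iterative label propagation, the confidence-margin test, the 1-nearest-neighbour fallback, and all function and parameter names (`gaussian_weights`, `propagate`, `safe_predict`, `margin`, `sigma`). It is not the thesis's method, only one plausible way to use the graph prediction for low-risk instances and fall back to a purely supervised prediction otherwise.

```python
import math


def gaussian_weights(points, sigma=1.0):
    """Dense similarity matrix from a Gaussian (RBF) kernel, zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w


def propagate(weights, labels, n_classes, n_iter=200):
    """Iterative label propagation; labels[i] is a class index or None."""
    n = len(labels)

    def clamp(i):
        # One-hot score vector for a labeled instance.
        return [1.0 if c == labels[i] else 0.0 for c in range(n_classes)]

    uniform = [1.0 / n_classes] * n_classes
    scores = [clamp(i) if labels[i] is not None else list(uniform)
              for i in range(n)]
    for _ in range(n_iter):
        new_scores = []
        for i in range(n):
            total = sum(weights[i])
            if labels[i] is not None:
                new_scores.append(clamp(i))      # labeled scores stay fixed
            elif total == 0.0:
                new_scores.append(list(uniform))  # isolated node: no evidence
            else:
                # Weighted average of the neighbours' current scores.
                new_scores.append([
                    sum(weights[i][j] * scores[j][c] for j in range(n)) / total
                    for c in range(n_classes)
                ])
        scores = new_scores
    return scores


def nearest_labeled(points, labels, i):
    """Supervised fallback: label of the closest labeled point (1-NN)."""
    labeled = [j for j, y in enumerate(labels) if y is not None]
    best = min(labeled, key=lambda j: sum(
        (a - b) ** 2 for a, b in zip(points[i], points[j])))
    return labels[best]


def safe_predict(points, labels, n_classes, margin=0.5, sigma=1.0):
    """Use the graph prediction only where its confidence margin is large.

    Assumes n_classes >= 2. Returns (predictions, used_graph), where
    used_graph[i] is True if instance i took the graph-based prediction.
    """
    scores = propagate(gaussian_weights(points, sigma), labels, n_classes)
    preds, used_graph = [], []
    for i, y in enumerate(labels):
        if y is not None:
            preds.append(y)
            used_graph.append(False)
            continue
        ranked = sorted(scores[i], reverse=True)
        confident = ranked[0] - ranked[1] >= margin
        if confident:
            preds.append(scores[i].index(ranked[0]))
        else:
            preds.append(nearest_labeled(points, labels, i))
        used_graph.append(confident)
    return preds, used_graph


# Two 1-D clusters with one labeled point each, plus an ambiguous midpoint.
points = [(0.0,), (0.5,), (1.0,), (5.0,), (5.5,), (6.0,), (3.0,)]
labels = [0, None, None, None, None, 1, None]
preds, used_graph = safe_predict(points, labels, n_classes=2)
```

In this toy example the unlabeled points inside each cluster receive a confident graph prediction, while the midpoint at 3.0 sits between the clusters, fails the margin test, and is handled by the supervised fallback instead. The threshold `margin` controls how conservative the selection is.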
Keywords/Search Tags:Semi-Supervised Learning, Safe Graph-Based SSL, Automated Machine Learning, Model Selection