Sequential Training Of Semi-supervised Classification Based On Sparse Gaussian Process Regression

Posted on: 2013-01-21 | Degree: Master | Type: Thesis
Country: China | Candidate: R Q Huang
GTID: 2218330374467178 | Subject: Computer application technology
Abstract/Summary:
This paper studies the theory and algorithms of the informative vector machine (IVM) and Gaussian process regression (GPR). By combining IVM and GPR, it then proposes a sequential training method for semi-supervised classification that is applicable to large-scale problems.

In real applications, semi-supervised learning usually has to deal with large numbers of unlabeled examples. For many existing semi-supervised learning algorithms, including the Gaussian process based algorithms considered in this paper, the computational cost of training a GPR model is cubic in the number of examples, mainly because a matrix must be inverted. This high computational complexity makes standard GP-based semi-supervised learning algorithms difficult to apply directly to large-scale problems. A semi-supervised learning algorithm that cannot handle large-scale problems, however, can hardly demonstrate its effectiveness in practical applications.

Gaussian process regression is a very important Bayesian approach in machine learning applications and has been used extensively in semi-supervised learning tasks. To overcome the high computational complexity of many existing semi-supervised learning algorithms, and the resulting difficulty of applying them to large-scale data sets, we propose a sequential training method for semi-supervised classification. The proposed method proceeds as follows. First, we use the IVM technique to train a sparse GPR classifier on part of the labeled and unlabeled examples; the outputs are the targets of the unlabeled examples and the representative points. Second, we use the representative points selected in the first step, together with some new examples from the remaining examples, to train another sparse semi-supervised GPR classifier. These two steps are repeated sequentially until all unlabeled examples have been assigned targets.
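
The two-step loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: all function names and parameters (`rbf_kernel`, `gpr_predict`, `select_representatives`, `sequential_train`, `chunk`, `m`, `noise`) are hypothetical, labels are assumed to be coded as -1/+1, a greedy largest-posterior-variance rule stands in for the IVM selection criterion, and unlabeled examples are handled by simple self-labeling rather than by the semi-supervised GPR model of the thesis. It does show why the cost stays bounded: each exact GPR fit only ever sees a small active set, so the cubic cost applies to that set rather than to the whole data set.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gpr_predict(X_train, y_train, X_test, noise=0.1):
    """Exact GPR posterior mean and variance; cubic in len(X_train)."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = rbf_kernel(X_train, X_test)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - (v**2).sum(0)  # the RBF prior variance is 1
    return mean, var

def select_representatives(X, m, noise=0.1):
    """Assumed stand-in for IVM: greedily add the point whose posterior
    variance given the points chosen so far is largest."""
    chosen = [0]
    while len(chosen) < min(m, len(X)):
        # The GP posterior variance does not depend on the targets,
        # so zeros are passed as placeholders.
        _, var = gpr_predict(X[chosen], np.zeros(len(chosen)), X, noise)
        var[chosen] = -np.inf  # never pick the same point twice
        chosen.append(int(np.argmax(var)))
    return np.array(chosen)

def sequential_train(X_lab, y_lab, X_unl, chunk=50, m=20, noise=0.1):
    """Assign -1/+1 targets to X_unl chunk by chunk, carrying forward
    at most m representative points between rounds."""
    X_act, y_act = X_lab, y_lab.astype(float)
    targets = np.empty(len(X_unl))
    for start in range(0, len(X_unl), chunk):
        X_new = X_unl[start:start + chunk]
        # Step 1: fit on the current active set, label the new chunk.
        mean, _ = gpr_predict(X_act, y_act, X_new, noise)
        y_new = np.where(mean >= 0, 1.0, -1.0)
        targets[start:start + chunk] = y_new
        # Step 2: keep m representatives of (active set + new chunk).
        X_pool = np.vstack([X_act, X_new])
        y_pool = np.concatenate([y_act, y_new])
        keep = select_representatives(X_pool, m, noise)
        X_act, y_act = X_pool[keep], y_pool[keep]
    return targets, X_act, y_act
```

With `m` fixed, every Cholesky factorization in this sketch involves at most `m` points, so the total work grows roughly linearly with the number of unlabeled examples. One simplification to be aware of: the greedy rule may discard originally labeled points in favor of self-labeled ones, which a more careful variant would probably avoid.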
The result is a sequential training method for semi-supervised learning that is applicable to large-scale data sets. Besides handling large-scale data sets, the proposed method is also suited to online learning, where the number of training examples keeps increasing.

The sequential training method for semi-supervised classification is simple and easy to implement. The hyper-parameters are obtained easily by maximizing the marginal likelihood, without resorting to expensive cross-validation. Evaluations of the proposed method on eight real-world data sets, in comparison with two related methods, show promising results.

In addition, sparse metric learning has become a research hotspot in machine learning over the past two years. Previously, we proposed a sparse kernel regression model and studied its application to short-term traffic flow forecasting. At the end of this paper, we give a brief introduction to the sparse kernel regression model and point out its relationship to the sequential training method for semi-supervised classification.
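
For reference, the marginal likelihood mentioned above has a well-known closed form in standard GPR with Gaussian noise. The notation here is assumed rather than taken from the thesis: $\mathbf{y}$ is the vector of $n$ training targets, $K_\theta$ the kernel matrix with hyper-parameters $\theta$, and $\sigma^2$ the noise variance. The abstract does not state the exact objective used for the sparse semi-supervised model, which may differ from this one.

```latex
\log p(\mathbf{y} \mid X, \theta)
  = -\tfrac{1}{2}\, \mathbf{y}^{\top} \left(K_\theta + \sigma^2 I\right)^{-1} \mathbf{y}
    - \tfrac{1}{2} \log \left| K_\theta + \sigma^2 I \right|
    - \tfrac{n}{2} \log 2\pi
```

Maximizing this expression over $\theta$ and $\sigma^2$ needs only the training data, which is why no held-out folds or repeated cross-validation fits are required.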
Keywords/Search Tags: Semi-supervised Classification, Gaussian Processes (GPs), Informative Vector Machine (IVM), Sparse Gaussian Process Regression, Sequential Training