
Study On Unsupervised And Semi-Supervised Dimensionality Reduction Algorithms

Posted on: 2011-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q K Zhang
Full Text: PDF
GTID: 2178360305464243
Subject: Signal and Information Processing
Abstract/Summary:
In machine learning and pattern recognition, high-dimensional data sets are unavoidable, and they usually give rise to the problem known as the curse of dimensionality. Reducing the dimension of the data is therefore necessary. Dimensionality reduction aims to capture the low-dimensional structure embedded in a data set through linear or nonlinear mappings. The main purpose of this thesis is to develop unsupervised and semi-supervised dimensionality reduction algorithms.

Firstly, for unsupervised dimensionality reduction, the Gaussian process latent variable model (GP-LVM) is an effective algorithm. It provides a smooth mapping from the latent space to the data space, which means that points close together in latent space are mapped to points close together in data space. However, it does not guarantee the converse: points that are close in data space need not be close in latent space. To overcome this drawback, a latent variable model based on locality preserving projections is proposed, which forces points that are close in data space to be close in latent space. Experimental results on several data sets demonstrate the effectiveness of this method.

Secondly, for semi-supervised dimensionality reduction, besides instance labels there is another kind of supervised information: pairwise constraints, which indicate whether two instances belong to the same class. However, existing algorithms based on pairwise constraints have not exploited the intrinsic properties of the constraints, such as transitivity and exclusivity. Therefore, two semi-supervised dimensionality reduction algorithms are proposed. The first is a global-preserving algorithm based on pairwise constraints, which exploits transitivity and exclusivity while preserving the global structure of the data manifold in the low-dimensional embedding space. The second is a local-preserving algorithm based on pairwise constraints, which preserves transitivity and exclusivity as well as the local structure of the data manifold. Experiments on several data sets show that the proposed algorithms are superior to other dimensionality reduction methods.
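
To make the transitivity and exclusivity of pairwise constraints concrete, here is a minimal Python sketch that expands a small constraint set. The reading of the two properties is an assumption, not taken from the thesis: transitivity is taken to mean that two must-link pairs sharing an instance imply a third must-link pair, and exclusivity that a must-link pair plus a cannot-link pair sharing an instance imply a further cannot-link pair. The function `expand_constraints` and its argument names are hypothetical, and this is not the thesis's algorithm, only an illustration of the constraint properties.

```python
from itertools import combinations


def expand_constraints(n, must_link, cannot_link):
    """Expand pairwise constraints over n instances (indices 0..n-1).

    Hypothetical helper: applies the assumed transitivity and
    exclusivity rules and returns (must, cannot) as sets of (i, j), i < j.
    """
    parent = list(range(n))

    def find(i):
        # Find the group representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Transitivity: merge instances joined by any chain of must-links.
    for a, b in must_link:
        parent[find(a)] = find(b)

    # Collect the members of each must-link group (in ascending order).
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    must = set()
    for members in groups.values():
        must.update(combinations(members, 2))

    # Exclusivity: a cannot-link between two groups separates every cross pair.
    cannot = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"inconsistent constraints on pair ({a}, {b})")
        for i in groups[ra]:
            for j in groups[rb]:
                cannot.add((min(i, j), max(i, j)))
    return must, cannot


must, cannot = expand_constraints(
    5, must_link=[(0, 1), (1, 2)], cannot_link=[(2, 3)]
)
print(sorted(must))    # [(0, 1), (0, 2), (1, 2)]
print(sorted(cannot))  # [(0, 3), (1, 3), (2, 3)]
```

In this example, two given must-link pairs yield the additional pair (0, 2), and the single cannot-link pair (2, 3) spreads to every member of the group containing instance 2. Instance 4 has no constraints and stays unconstrained.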
Keywords/Search Tags: Dimensionality reduction, Gaussian process, Latent variable model, Pairwise constraints