With the rapid development of computer technology, the ability to collect and store data has improved greatly. As a result, machine learning and pattern recognition increasingly encounter high-dimensional data, which gives rise to the so-called "curse of dimensionality". To avoid this problem, the dimensionality of such data must be reduced. Dimensionality reduction projects samples from a high-dimensional space onto a lower-dimensional space through a linear or nonlinear mapping, while revealing the intrinsic structure hidden in the data. It therefore plays a crucial role in dealing with the curse of dimensionality.

As data acquisition keeps improving and storage capacity keeps expanding, unlabeled samples are becoming easier and easier to obtain in many real-world applications. Labeling samples, however, usually comes at a relatively high cost, so labeled samples are typically far fewer than unlabeled ones. Traditional machine learning methods consider either the unlabeled data or the labeled data alone, yet the two kinds of data often coexist in real problems. Semi-supervised learning emerged to address this situation, and it has become an important research topic because it can exploit unlabeled and labeled data together.

In this paper, we first review the state of research on dimensionality reduction. Second, we introduce the theory of semi-supervised dimensionality reduction. Finally, we propose specific semi-supervised dimensionality reduction methods. The main contributions can be summarized as follows:
1. We propose a novel semi-supervised dimensionality reduction method that not only preserves the global and local structure of the unlabeled samples but also makes use of a small number of labeled samples.
2. As a nonlinear dimensionality reduction approach, the kernel method can effectively extract nonlinear features of a data set and places no constraint on the data distribution in the original space. We therefore propose a kernel semi-supervised dimensionality reduction method.
3. We demonstrate the usefulness of the proposed methods through experiments on a broad range of data sets.
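
For orientation, the linear and kernel mappings referred to above can be written generically as follows. This is only an illustrative sketch in standard notation, not the specific formulation proposed in this paper; the symbols x, y, W, alpha_i, k, D, d and n are assumptions introduced here for the example.

```latex
% Linear dimensionality reduction: a projection matrix W maps a
% D-dimensional sample x to a d-dimensional representation y, with d < D.
y = W^{\top} x,
\qquad x \in \mathbb{R}^{D},\quad W \in \mathbb{R}^{D \times d},\quad y \in \mathbb{R}^{d},\quad d < D

% Kernel (nonlinear) dimensionality reduction: the projection is expressed
% through kernel evaluations against the n training samples x_1, ..., x_n.
y = \sum_{i=1}^{n} \alpha_i \, k(x_i, x),
\qquad \alpha_i \in \mathbb{R}^{d}
```

In the kernel form the nonlinear feature map never has to be computed explicitly, since only the kernel values k(x_i, x) are needed. With the linear kernel k(x_i, x) = x_i^T x, the second expression reduces to the first with W = sum_i x_i alpha_i^T.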