
The Study Of Graph-based Semi-supervised Learning/Dimensionality Reduction Methods And Their Applications

Posted on: 2011-03-11    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J Gui    Full Text: PDF
GTID: 1118360305966678    Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Recently, semi-supervised learning and dimensionality reduction have become hot topics in the field of machine learning. The goal of semi-supervised learning is to learn from partially labeled data; in this thesis, I focus on graph-based semi-supervised learning. Dimensionality reduction transforms a dataset X of dimensionality D into a new dataset Y of lower dimensionality d, the intrinsic dimensionality of the data, while retaining the geometry of the data as much as possible. I make a thorough study of graph-based semi-supervised learning and dimensionality reduction methods. More concretely, the main work of this thesis can be summarized as follows:

(1) Both supervised and unsupervised methods have been widely used to solve the tumor classification problem based on gene expression profiles. This thesis introduces a semi-supervised graph-based method for tumor classification. Feature extraction plays a key role in tumor classification based on gene expression profiles and can greatly improve the performance of a classifier. A novel method for extracting tumor-related features is proposed: first, the Wilcoxon rank-sum test is used for gene selection; then gene ranking and the discrete cosine transform are combined with principal component analysis for feature extraction; finally, the performance is evaluated with semi-supervised learning algorithms.

(2) A modified version of the semi-supervised learning algorithm with local and global consistency is proposed. The new method incorporates label information and adopts the geodesic distance, rather than the Euclidean distance, as the measure of dissimilarity between data points. In addition, class prior knowledge is added to the cost function; its effect was found to differ between high and low label rates. Experimental results show that the modified algorithm achieves better classification performance than the original.

(3) A new subspace learning algorithm called locality preserving discriminant projections (LPDP) is proposed by adding the maximum margin criterion (MMC) to the objective function of locality preserving projections (LPP). LPDP retains the locality-preserving property of LPP and exploits the label information in MMC, which maximizes the between-class distance and minimizes the within-class distance. LPDP thus combines the manifold criterion with the Fisher criterion; it has more discriminating power and is better suited to recognition tasks than LPP, which considers only local information. Moreover, two tensorized (multilinear) forms of LPDP are derived, one iterative and the other non-iterative. Finally, LPDP is applied to face and palmprint biometrics and evaluated on the Yale and ORL face image databases and the PolyU palmprint database. Experimental results show the effectiveness of LPDP and demonstrate that it is a good choice for real-world biometric applications.
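To make the combined criterion in (3) concrete, the following is a minimal Python sketch of one LPDP-style projection: the MMC scatter difference (S_b - S_w) is combined with an LPP graph-Laplacian locality term, and the projection is taken from the leading eigenvectors. The trade-off weight beta, the heat-kernel width sigma, the k-nearest-neighbour graph construction, and the function name are illustrative assumptions; the exact weighting used in the thesis may differ.

    import numpy as np
    from scipy.linalg import eigh
    from scipy.spatial.distance import cdist

    def lpdp_like_projection(X, y, n_components=2, k=5, sigma=1.0, beta=1.0):
        """LPDP-style subspace learning sketch: MMC scatter difference plus
        an LPP locality-preserving penalty (beta, sigma, k are assumed)."""
        n, d = X.shape
        mean = X.mean(axis=0)
        # Between-class (Sb) and within-class (Sw) scatter for the MMC term
        Sb = np.zeros((d, d))
        Sw = np.zeros((d, d))
        for c in np.unique(y):
            Xc = X[y == c]
            mc = Xc.mean(axis=0)
            Sb += len(Xc) * np.outer(mc - mean, mc - mean)
            Sw += (Xc - mc).T @ (Xc - mc)
        # k-NN affinity graph with heat-kernel weights for the LPP term
        D2 = cdist(X, X, 'sqeuclidean')
        W = np.zeros((n, n))
        neighbours = np.argsort(D2, axis=1)[:, 1:k + 1]
        for i in range(n):
            W[i, neighbours[i]] = np.exp(-D2[i, neighbours[i]] / (2 * sigma ** 2))
        W = np.maximum(W, W.T)                    # symmetrize the graph
        L = np.diag(W.sum(axis=1)) - W            # graph Laplacian
        # Maximize a^T [ (Sb - Sw) - beta * X^T L X ] a over orthonormal a
        M = (Sb - Sw) - beta * (X.T @ L @ X)
        _, vecs = eigh(M)
        A = vecs[:, ::-1][:, :n_components]       # leading eigenvectors
        return X @ A                               # low-dimensional embedding

As is common for LPP-based methods, a PCA preprocessing step would typically be applied first when the feature dimension exceeds the number of samples.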
(4) Spectral regression discriminant analysis (SRDA) and its kernel version, SRKDA, are recently proposed subspace learning methods, both of which have a free parameter, the regularization parameter, and how to set it automatically had not been well addressed. In SRDA this parameter is simply set to a constant, which is clearly suboptimal. A new algorithm is developed to automatically estimate the regularization parameter of SRDA based on perturbation linear discriminant analysis (PLDA). Two methods for regularization parameter estimation in SRKDA are also proposed: one is derived from the optimal regularization parameter estimation for SRDA (OR-SRDA), and the other uses the kernel version of PLDA. Experiments on different data sets demonstrate the effectiveness and feasibility of the proposed methods.
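The PLDA-based estimator itself is not reproduced here. As a hedged illustration of where the regularization parameter enters, the sketch below builds an SRDA-style embedding by ridge regression onto centered class-indicator targets and selects the regularization strength by cross-validation instead; the function name and the alpha grid are assumptions for illustration, not the method proposed in the thesis.

    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.preprocessing import LabelBinarizer

    def srda_style_embedding(X, y, alphas=(1e-3, 1e-2, 1e-1, 1.0, 10.0)):
        """Spectral-regression-style discriminant embedding where the ridge
        regularization parameter is chosen by cross-validation (a stand-in
        for the PLDA-based estimation developed in the thesis)."""
        # Centered class-indicator matrix plays the role of the regression targets
        T = LabelBinarizer().fit_transform(y).astype(float)
        T -= T.mean(axis=0)
        # Regularized least squares onto the targets; RidgeCV picks the
        # regularization strength by (generalized) cross-validation
        reg = RidgeCV(alphas=alphas).fit(X, T)
        return X @ reg.coef_.T    # embedded training data

In SRDA proper, the targets come from a spectral analysis of the supervised graph and the regression is solved per target; this sketch only shows the role the regularization parameter plays in the least-squares step.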
Keywords/Search Tags: Graph-based semi-supervised learning, Dimensionality reduction, Multi-step dimensionality reduction, Locality preserving projections, Spectral regression discriminant analysis