
The Study Of Graph-based Semi-supervised Learning/Dimensionality Reduction Methods And Their Applications

Posted on: 2011-03-11    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J Gui    Full Text: PDF
GTID: 1118360305966678    Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Recently, semi-supervised learning and dimensionality reduction have become hot topics in the field of machine learning. The goal of semi-supervised learning is to learn from partially labeled data; in this thesis, I focus on graph-based semi-supervised learning. Dimensionality reduction transforms a dataset X of dimensionality D into a new dataset Y of lower dimensionality d, the intrinsic dimensionality of the data, while retaining the geometry of the data as much as possible. I make a thorough study of graph-based semi-supervised learning and dimensionality reduction methods. More concretely, the main work of this thesis can be summarized as follows:

(1) Both supervised and unsupervised methods have been widely used to solve the tumor classification problem based on gene expression profiles. This thesis introduces a semi-supervised graph-based method for tumor classification. Feature extraction plays a key role in tumor classification based on gene expression profiles and can greatly improve the performance of a classifier. A novel method for extracting tumor-related features is proposed: first, the Wilcoxon rank-sum test is used for gene selection; then gene ranking and the discrete cosine transform are combined with principal component analysis for feature extraction; finally, the performance is evaluated with semi-supervised learning algorithms.

(2) A modified version of the semi-supervised learning algorithm with local and global consistency is proposed. The new method incorporates label information and adopts the geodesic distance, rather than the Euclidean distance, as the measure of dissimilarity between data points. In addition, class prior knowledge is added to the cost function; its effect was found to differ between high and low label rates. Experimental results show that the modified algorithm achieves better classification performance than the original.

(3) A new subspace learning algorithm called locality preserving discriminant projections (LPDP) is proposed by adding the maximum margin criterion (MMC) to the objective function of locality preserving projections (LPP). LPDP retains the locality-preserving property of LPP and exploits the label information in MMC, which maximizes the between-class distance and minimizes the within-class distance. LPDP thus combines the manifold criterion with the Fisher criterion; it has more discriminating power and is better suited to recognition tasks than LPP, which considers only local information. Moreover, two tensorized (multilinear) forms of LPDP are derived, one iterative and the other non-iterative. Finally, LPDP is applied to face and palmprint biometrics and evaluated on the Yale and ORL face image databases and the PolyU palmprint database. Experimental results show the effectiveness of LPDP and demonstrate that it is a good choice for real-world biometric applications.
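To make the combined criterion in (3) concrete, the following is a minimal Python sketch of one LPDP-style projection: the MMC scatter difference (S_b - S_w) is combined with an LPP graph-Laplacian locality term, and the projection is taken from the leading eigenvectors. The trade-off weight beta, the heat-kernel width sigma, the k-nearest-neighbour graph construction, and the function name are illustrative assumptions; the exact weighting used in the thesis may differ.

    import numpy as np
    from scipy.linalg import eigh
    from scipy.spatial.distance import cdist

    def lpdp_like_projection(X, y, n_components=2, k=5, sigma=1.0, beta=1.0):
        """LPDP-style subspace learning sketch: MMC scatter difference plus
        an LPP locality-preserving penalty (beta, sigma, k are assumed)."""
        n, d = X.shape
        mean = X.mean(axis=0)
        # Between-class (Sb) and within-class (Sw) scatter for the MMC term
        Sb = np.zeros((d, d))
        Sw = np.zeros((d, d))
        for c in np.unique(y):
            Xc = X[y == c]
            mc = Xc.mean(axis=0)
            Sb += len(Xc) * np.outer(mc - mean, mc - mean)
            Sw += (Xc - mc).T @ (Xc - mc)
        # k-NN affinity graph with heat-kernel weights for the LPP term
        D2 = cdist(X, X, 'sqeuclidean')
        W = np.zeros((n, n))
        neighbours = np.argsort(D2, axis=1)[:, 1:k + 1]
        for i in range(n):
            W[i, neighbours[i]] = np.exp(-D2[i, neighbours[i]] / (2 * sigma ** 2))
        W = np.maximum(W, W.T)                    # symmetrize the graph
        L = np.diag(W.sum(axis=1)) - W            # graph Laplacian
        # Maximize a^T [ (Sb - Sw) - beta * X^T L X ] a over orthonormal a
        M = (Sb - Sw) - beta * (X.T @ L @ X)
        _, vecs = eigh(M)
        A = vecs[:, ::-1][:, :n_components]       # leading eigenvectors
        return X @ A                               # low-dimensional embedding

As is common for LPP-based methods, a PCA preprocessing step would typically be applied first when the feature dimension exceeds the number of samples.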
(4) Spectral regression discriminant analysis (SRDA) and its kernel version, SRKDA, are recently proposed subspace learning methods, both of which have a free parameter, the regularization parameter, and how to set it automatically had not been well addressed. In SRDA this parameter is simply set to a constant, which is clearly suboptimal. A new algorithm is developed to automatically estimate the regularization parameter of SRDA based on perturbation linear discriminant analysis (PLDA). Two methods for regularization parameter estimation in SRKDA are also proposed: one is derived from the optimal regularization parameter estimation for SRDA (OR-SRDA), and the other uses the kernel version of PLDA. Experiments on different data sets demonstrate the effectiveness and feasibility of the proposed methods.
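The PLDA-based estimator itself is not reproduced here. As a hedged illustration of where the regularization parameter enters, the sketch below builds an SRDA-style embedding by ridge regression onto centered class-indicator targets and selects the regularization strength by cross-validation instead; the function name and the alpha grid are assumptions for illustration, not the method proposed in the thesis.

    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.preprocessing import LabelBinarizer

    def srda_style_embedding(X, y, alphas=(1e-3, 1e-2, 1e-1, 1.0, 10.0)):
        """Spectral-regression-style discriminant embedding where the ridge
        regularization parameter is chosen by cross-validation (a stand-in
        for the PLDA-based estimation developed in the thesis)."""
        # Centered class-indicator matrix plays the role of the regression targets
        T = LabelBinarizer().fit_transform(y).astype(float)
        T -= T.mean(axis=0)
        # Regularized least squares onto the targets; RidgeCV picks the
        # regularization strength by (generalized) cross-validation
        reg = RidgeCV(alphas=alphas).fit(X, T)
        return X @ reg.coef_.T    # embedded training data

In SRDA proper, the targets come from a spectral analysis of the supervised graph and the regression is solved per target; this sketch only shows the role the regularization parameter plays in the least-squares step.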
Keywords/Search Tags: Graph-based semi-supervised learning, Dimensionality reduction, Multi-step dimensionality reduction, Locality preserving projections, Spectral regression discriminant analysis