
Research Of Manifold Learning In Data Dimension Reduction And Classification

Posted on: 2008-03-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X M Liu
Full Text: PDF
GTID: 1118360212984898
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid advancement and wide application of information technology, high-dimensional data with complex structure are being generated at an ever-increasing rate. High dimensionality not only makes such data hard to understand, but also renders traditional machine learning and data mining techniques less effective. Dimension reduction is one of the key techniques for handling high-dimensional data; although much research has been done in this field, uncovering the linear and nonlinear structure in data remains a challenging problem. In 2000, three articles published in Science studied the dimension reduction problem from the perspectives of neuroscience and computer science, which further accelerated research in this field and made manifold learning methods for dimension reduction one of the hot topics in machine learning.

This dissertation addresses manifold learning for dimension reduction and its applications, studying the problem from both the linear and nonlinear, and both the unsupervised and supervised, perspectives. The main contributions of this dissertation can be summarized as follows:

From the perspective of supervised global linear dimension reduction, a Pairwise Covariance-preserving Projection Method (PCPM) is proposed, which maximizes the distances between class means while approximately preserving the distances between pairwise class covariances. The optimization involved in PCPM can be solved directly by eigenvalue decomposition. Theoretical analysis reveals the relationships between PCPM and Linear Discriminant Analysis (LDA), the Sliced Average Variance Estimator (SAVE), Heteroscedastic Discriminant Analysis (HDA), and the Covariance-preserving Projection Method (CPM).
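The eigendecomposition machinery behind a PCPM-style projection can be sketched as follows. This is a hedged illustration, not the dissertation's exact formulation: the way the pairwise class-mean and class-covariance differences are combined into one symmetric matrix below is an assumption made for the sketch.

```python
import numpy as np

def pcpm_sketch(X, y, d):
    """Illustrative PCPM-style projection (assumed objective, not the
    dissertation's exact one): accumulate pairwise class-mean difference
    terms plus pairwise class-covariance difference terms into a symmetric
    matrix, then project onto its top-d eigenvectors."""
    classes = np.unique(y)
    D = X.shape[1]
    stats = [(X[y == c].mean(axis=0), np.cov(X[y == c], rowvar=False))
             for c in classes]
    S = np.zeros((D, D))
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            dm = (stats[i][0] - stats[j][0])[:, None]   # mean difference
            dC = stats[i][1] - stats[j][1]              # covariance difference
            S += dm @ dm.T + dC @ dC
    vals, vecs = np.linalg.eigh(S)          # ascending eigenvalues
    W = vecs[:, ::-1][:, :d]                # eigenvectors of the d largest
    return X @ W, W
```

Because the matrix is symmetric, a single `eigh` call yields an orthonormal projection basis directly, which mirrors the abstract's claim that PCPM is solved by eigenvalue decomposition rather than an iterative optimization.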
Furthermore, pairwise weights related to Bayes classification accuracy are incorporated naturally into the pairwise summation form, and a weighted PCPM is also proposed.

From the perspective of unsupervised nonlinear dimension reduction, an incremental version (ILTSA) of Local Tangent Space Alignment (LTSA) is proposed, which greatly reduces the computation time LTSA requires to handle new data. For a new sample, the low-dimensional affine subspaces of the affected points are updated first. By minimizing the reconstruction error of the new point with respect to the existing points, the global coordinates and local alignment matrix of the existing points are obtained, and the global coordinate of the new point is computed in the least-squares sense. Finally, the global coordinates of all points are updated with Rayleigh-Ritz acceleration. In addition, a landmark version of LTSA (LLTSA) is proposed, in which landmarks are selected by LASSO regression; this reduces the memory demand of the algorithm. An incremental version (ILLTSA) of LLTSA is also proposed.

From the perspective of supervised nonlinear dimension reduction, a new transductive classification method based on LTSA and transductive k-nearest neighbors is proposed. In this method, an improved two-stage LDA/QR method is used to construct the local low-dimensional coordinates, which not only exploits the label information of the samples but also overcomes the singularity problem of traditional LDA. The global low-dimensional embedding coordinates are then obtained with LTSA, and the TCM-KNN method is finally used for classification on the low-dimensional manifold.

From the perspective of linear approximation of nonlinear dimension reduction, an Orthogonal Neighborhood Preserving Embedding (ONPE) method is proposed to overcome the sensitivity of Neighborhood Preserving Embedding to the estimated dimension.
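The iterative orthogonalization at the heart of such a method can be sketched as follows. This is a hedged illustration, not the dissertation's algorithm: the uniform neighbor weights stand in for the least-squares reconstruction weights of NPE, and the penalty-based deflation is one standard way to enforce orthogonality to directions already found.

```python
import numpy as np

def uniform_reconstruction_matrix(X, k):
    """Simplified NPE-style alignment matrix M = (I - W)^T (I - W),
    using uniform neighbor weights 1/k instead of solving the
    least-squares reconstruction weights (a simplifying assumption)."""
    n = len(X)
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D2[i])[1:k + 1]   # skip the point itself
        W[i, nbrs] = 1.0 / k
    IW = np.eye(n) - W
    return IW.T @ IW

def orthogonal_embedding(X, M, d):
    """Find d projection directions one at a time, each minimizing
    a^T (X^T M X) a and kept orthogonal to the directions found so far
    by penalizing their span before the eigendecomposition."""
    C = X.T @ M @ X
    penalty = 10.0 * np.trace(C) + 1.0      # larger than any eigenvalue of C
    A = np.zeros((X.shape[1], 0))
    for _ in range(d):
        P = np.eye(X.shape[1]) - A @ A.T    # project out found directions
        vals, vecs = np.linalg.eigh(P @ C @ P + penalty * (A @ A.T))
        a = vecs[:, 0]                      # smallest-eigenvalue direction
        A = np.column_stack([A, a / np.linalg.norm(a)])
    return X @ A, A
```

Computing one basis vector per iteration, with each new direction constrained to the orthogonal complement of the previous ones, is what guarantees an orthogonal projection matrix, in contrast to the generalized eigenproblem of plain NPE.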
ONPE obtains the low-dimensional coordinates by iteratively computing mutually orthogonal basis functions, which guarantees the orthogonality of the projection matrix. Moreover, by exploiting the local geometry used during ONPE dimension reduction, a new classification method based on the label propagation method (LNP) is proposed.

From the perspective of supervised linear dimension reduction with matrix representation, a method called two-dimensional locality sensitive discriminant analysis (2DLSDA) for image recognition is proposed. It operates directly on 2D image matrices, thereby overcoming the singularity problem and exploiting the spatial information among pixels more effectively. Two orthogonal transform matrices are computed via Schur decomposition; on this basis the projection matrices are orthogonal and can be obtained more efficiently and with better numerical stability. In addition, based on the two ways of unfolding the image matrices, two unilateral 2DLSDA methods are proposed.
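The bilateral (two-sided) projection of image matrices can be sketched as follows. This is a hedged stand-in using simple 2DPCA-style row and column scatter matrices, not 2DLSDA's locality-sensitive discriminant criterion; note that for the symmetric matrices below, the Schur decomposition the dissertation relies on coincides with the eigendecomposition, so the transforms are orthogonal here as well.

```python
import numpy as np

def bilateral_2d_projection(images, p, q):
    """Illustrative bilateral 2D projection on an (n, h, w) image stack
    (a 2DPCA-style simplification of 2DLSDA): build row- and column-
    scatter matrices directly from the image matrices, then project each
    image from both sides with orthogonal eigenvector bases."""
    mean = images.mean(axis=0)
    Gr = sum((A - mean) @ (A - mean).T for A in images)   # row scatter,    h x h
    Gc = sum((A - mean).T @ (A - mean) for A in images)   # column scatter, w x w

    def top_basis(G, k):
        vals, vecs = np.linalg.eigh(G)      # ascending eigenvalues
        return vecs[:, ::-1][:, :k]         # basis for the k largest

    L, R = top_basis(Gr, p), top_basis(Gc, q)
    # each h x w image is reduced to a p x q feature matrix: L^T A R
    return np.stack([L.T @ A @ R for A in images]), L, R
```

Working on the image matrices directly keeps the scatter matrices at h x h and w x w instead of hw x hw, which is why such 2D methods avoid the singularity problem that vectorized LDA faces when samples are scarce.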
Keywords/Search Tags: manifold learning, linear dimension reduction, nonlinear dimension reduction, local tangent space, transductive classification, covariance preserving, locality sensitive discriminant analysis, matrix representation, LDA/QR