Font Size: a A A

Study On Several Issues Of Spectral Method For Manifold Learning

Posted on:2010-07-20Degree:DoctorType:Dissertation
Country:ChinaCandidate:X H CengFull Text:PDF
GTID:1118360275963219Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
In the current information age,a large quantity of data can be obtained easily.The obtained data are high-dimensional,enormous,multifarious,disordered,and continuously increasing in many practical applications.The valuable information is submerged into large scale dataset.It is necessary to find the intrinsic laws of the dataset and predict the future development trend.Manifold learning assumes that these observed data lie on or close to intrinsic low-dimensional manifolds embedded in the high-dimensional Euclidean space.The main goal of manifold learning is to find intrinsic low-dimensional manifold structures of high-dimensional observed dataset and the embedding map.At present,manifold learning has become a hot issue in the fields of machine learning,pattern recognition,data mining and other related research.By analyzing the intension and extension of manifold learning,this dissertation is devoted to solving several important problems of spectral methods for manifold learning,and carries out a series of research on algorithm design and image manifold application.Firstly,traditional spectral methods are analyzed and contrasted in detail. Secondly,five problems are mainly investigated,which include knowledge-increasable learning of manifold,constructing a reasonable measure of neighborhood relation, enhancing the separability of the low-dimensional space,manifold learning based on ensemble,combining the advantages of local preserving algorithms and global preserving algorithms in manifold learning.Finally,this dissertation proposes five manifold learning algorithms based on spectral method.Our proposed algorithms are compared with the related research in theories and experiments.And the results show the effectiveness of our proposed algorithms.The main contributions of this dissertation are summarized as follows:(1) The concept of knowledge-increasable manifold learning is defined.It is advantageous to guide the research of manifold learning algorithms which fit to knowledge-increasable learning mechanism in human brain.According to the guiding principle of knowledge-increasable manifold learning,we propose a dynamically knowledge-increasable manifold learning algorithm based on locally linear embedding (DKI-LLE).Experimental results show that DKI-LLE algorithm has better results than several LLE-based incremental algorithms in dealing with the new dataset.DKI-LLE is nearer to the original(batch) LLE for discovering low-dimensional structure,and can integrate the low-dimensional structure knowledge of the new data subset into the previous low-dimensional structure.On the contrary,incremental LLE algorithms depend more on those previous low-dimensional coordinates for dealing with the new observed data.(2) A generalized Gaussian Laplacian eigenmap algorithm based on geodesic distance (GGLE) is proposed,which incorporates geodesic distance and generalized Gaussian function into the original Laplacian eigenmap algorithm.GGLE algorithm can adjust the similarities between nodes of neighborhood graph,and can preserve the different degrees of local properties by using super-Gaussian function,Gaussian function or sub-Gaussian function.Moreover,GGLE can avoid the deficiency of Euclidean distance by using geodesic distance when neighborhoods of data points are enlarged for preserving more neighborhood relations.Experimental results show that the global low-dimensional coordinates obtained by GGLE have different clustering properties and different degrees of preserving local neighborhood structures when different generalized Gaussian functions are used to measuring the similarities between high-dimensional data points.(3) An ensemble-based discriminant algorithm based on GGLE(EGGLE) is proposed. The main advantages of EGGLE include:the fixed neighborhood parameter k,only one time for constructing neighborhood graph and geodesic distance matrix,only needing to choose generalized Gaussian functions many times for constructing many Laplacian matrixes,obtaining multiple independent low-dimensional coordinate sets, independently training multiple classifiers,and combining classification results of these component classifiers to produce the final identification.EGGLE algorithm generally outperforms Ensemble-ISOMAP algorithm and En-ULLELDA algorithm on the time complexity.Under the framework for semi-supervised learning,the comparative experiments of EGGLE and LE are completed.Experimental results show that EGGLE algorithm is effective.In addition,a supervised manifold learning algorithm based on ensemble(EGGLE-LDA) is proposed in this dissertation.For enhancing the classification ability of ensemble-based manifold learning in supervised learning, EGGLE is combined with LDA,so that EGGLE-LDA algorithm takes into account the label information of data as well as the characterization of the geometric distribution. Experimental results show the difference of the ensemble-based recognition performance between EGGLE-LDA algorithm and En-ULLELDA algorithm.(4) A global Laplacian unfolding algorithm(GLU) is proposed so that LE algorithm with preserving local properties is integrated into MVU algorithm with preserving global properties.The main idea of GLU algorithm is that nearby points are pulled as near as possible while distant points are pulled as far apart as possible.Its implementation method is to construct the double object functions for remaining as near as possible in locality and unfolding in globality,to reformulate the double object functions in terms of the Gram inner product matrix of low-dimensional coordinates,to optimize the object function for learning the inner product matrix by semi-definite programming,and to compute a intrinsically low-dimensional embedding via the eigen-decomposition of the inner product matrix.Visualization experiments on synthetic "Two Moons" dataset,USPS handwritten digits dataset and the sculpture head portrait dataset demonstrate the effectiveness of GLU algorithm.In addition,the classification ability and visualization performance of four manifold learning algorithms that include LE,MVU,UDP and GLU are compared.Experimental results show the superiority of GLU algorithm.
Keywords/Search Tags:Manifold learning, Spectral method, Neighborhood graph, Laplacian, Knowledge-increasable, Generalized Gaussian function, Ensemble, Semi-definite programming, Unfolding
PDF Full Text Request
Related items