Dimension Reduction in a Number of Issues

Posted on: 2011-10-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L He
GTID: 1118360305497349
Subject: Computer software and theory

Abstract/Summary:

Dimensionality reduction (DR) is one of the most important problems in machine learning. This thesis makes four main contributions to the field. (1) Manifold learning is an unsupervised, nonlinear family of DR methods that has attracted much attention in recent years. The diversity of manifold learning algorithms makes it difficult to establish a proper taxonomy and evaluation criteria for them. We propose a taxonomy based on the design of the algorithms (distance-preserving mappings, graph embeddings, and statistical methods) and discuss the pros and cons of each category. We further examine several other aspects: the complexity of the algorithms; the relationship between spectra and dimensions; the impact of noise on each category; the consequences of "holes" in the parametric space; locality-preserving rates, which can detect whether the topological structure of the manifold is destroyed; and magnification factors, principal spreading directions, and their quantitative criteria, which allow us to explore more detailed properties of the mapping. As an application to face recognition, we choose a suitable algorithm for DR and obtain a higher recognition rate than traditional linear DR methods. (2) As an important branch of manifold learning, graph embedding algorithms (cf. Chapter 2) can be parametrized (including linearization and kernelization) to form a complete DR framework.
However, the O(N³) cost of optimizing the kernelized models makes them impractical for real applications. We propose an approximation algorithm based on AP-initialized k-means, which has a lower quantization error and therefore approximates the Gram matrix better than other methods using the same number of exemplars. We analyze the approximation of different parts of the spectrum, and experiments verify that different applications demand different spectra. We also derive an error bound for the approximation of the mapping, which is likewise proved to be bounded by the quantization error. Direct approximation of the mapping function has a more intuitive interpretation than approximating the Gram matrix, and in our graph embedding experiments the former performs better with fewer parameters than the latter. In previous work we compared several linearized graph embedding algorithms; with the approximation algorithms we can now compare them on large-scale problems, from which we draw some interesting conclusions, e.g. graph embeddings corresponding to minimal eigenvalues suffer from kernel functions with fast-decreasing spectra, and local models might instead be realized via local kernels.
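The exemplar-based approximation in contribution (2) is in the spirit of the Nyström method: pick m ≪ N landmark points whose quantization error is small, then reconstruct the N×N Gram matrix from its N×m and m×m blocks. The sketch below is illustrative only: it uses plain Lloyd k-means for the landmarks (the thesis initializes k-means with affinity propagation, which is omitted here), and all data, kernel parameters, and sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))          # toy data: N = 300 points in 5-D

def rbf(A, B, gamma=0.05):
    """RBF (Gaussian) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kmeans(X, k, iters=25):
    """Plain Lloyd k-means; the thesis seeds it with affinity propagation
    (AP) to lower the quantization error, a step this sketch skips."""
    C = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        labels = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                C[j] = X[labels == j].mean(0)
    return C

Z = kmeans(X, k=50)                        # m = 50 exemplars (landmarks)
K_nm = rbf(X, Z)                           # N x m cross-kernel block
K_mm = rbf(Z, Z)                           # m x m landmark Gram block
K_approx = K_nm @ np.linalg.pinv(K_mm) @ K_nm.T   # rank-m Nystrom estimate

K_exact = rbf(X, X)
rel_err = np.linalg.norm(K_approx - K_exact) / np.linalg.norm(K_exact)
```

Because the spectral embedding is computed from the m×m block and the N×m cross-kernel, the O(N³) eigendecomposition of the full Gram matrix is avoided; better landmarks (lower quantization error) directly tighten the approximation, which is the motivation for the AP initialization.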
(3) We propose a supervised DR algorithm based on recently developed kernel independence criteria. Our analysis shows that it yields results comparable to SDR algorithms such as KDR, but at a lower computational cost: O(N²) plus one multiplication of size-N matrices, whereas KDR takes O(N³) time. We discuss potential problems of our models on some simulated data, but on many practical problems they yield results comparable to KDR. We also develop an algorithm to find an upper bound for SDR, a question rarely addressed in the literature. A further analysis discloses the relationship between these algorithms and graph embedding; graph embedding therefore provides a reasonable initial value for them and cuts down the cost of random search. To incorporate unsupervised information into these models, a Laplacian smoother is employed, which outperforms the purely supervised models when the data are projected into a low-dimensional space. We also point out a potential research direction: solving unsupervised DR and CCA problems with these algorithms. (4) Ordinal relationships are important in many practical problems because they reveal how the data are distributed on a manifold, and in our experiments they improve the generalization capability of learners. We treat these problems differently from traditional discriminative tasks, in a setting we call tendency learning. We show the differences and connections between tendency learning and other well-known learning problems, e.g.
classification models the separation surface while tendency learning models the transition. We analyze the feasibility of applying two linear models (SVM and PKLR) to tendency learning, and show that the latter is more suitable. The DAG-regularized model we obtain has to be solved with CCCP because of its non-convex constraints. Our experiments on two simulated and two real datasets show that the DAG-regularized model for tendency learning outperforms other supervised and semi-supervised models, especially when few data are labeled.

Keywords/Search Tags: machine learning, dimensionality reduction, manifold learning, kernel methods, graph embedding, independence criteria, supervised learning, semi-supervised learning, unsupervised learning
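As a closing illustration of the kind of kernel independence criterion contribution (3) builds on: a well-known criterion in this family is the (biased) Hilbert-Schmidt independence criterion, trace(HKH·L)/(N−1)², which may differ in detail from the thesis's exact measure. The sketch below is a hypothetical minimal version; the data, kernels, and bandwidths are invented. It shows why the cost is low: with double-centering done directly, the estimate needs only O(N²) elementwise work on the Gram matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal((n, 1))
y_dep = x + 0.1 * rng.standard_normal((n, 1))   # strongly dependent on x
y_ind = rng.standard_normal((n, 1))             # independent of x

def rbf(A, gamma=1.0):
    """RBF kernel Gram matrix of the rows of A."""
    d2 = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def center(K):
    # Double-centering H K H, computed in O(n^2) without forming H.
    return K - K.mean(0) - K.mean(1)[:, None] + K.mean()

def hsic(K, L):
    """Biased HSIC estimate trace(HKH L) / (n-1)^2.  Both matrices are
    symmetric, so the trace of the product is an elementwise sum and the
    whole estimate costs O(n^2)."""
    n = len(K)
    return (center(K) * L).sum() / (n - 1) ** 2

h_dep = hsic(rbf(x), rbf(y_dep))   # dependent pair: clearly positive score
h_ind = hsic(rbf(x), rbf(y_ind))   # independent pair: score near zero
```

A supervised DR method of this type would search for a projection of the inputs that maximizes such a dependence score against the labels, which is why a cheap O(N²) criterion matters compared with KDR's O(N³) inner loop.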
 
