
Several Issues in Dimensionality Reduction

Posted on: 2011-10-01    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L He    Full Text: PDF
GTID: 1118360305497349    Subject: Computer software and theory
Abstract/Summary:
Dimensionality reduction (DR) is one of the most important problems in machine learning. This thesis makes four main contributions to the field:

(1) Manifold learning is an unsupervised, nonlinear DR approach that has attracted much attention in recent years. Because the algorithms are so diverse, it is difficult to find a proper taxonomy and evaluation criteria for them. We propose a taxonomy based on the design of the algorithms, comprising distance-preserving mappings, graph embeddings and statistical methods, and then discuss the pros and cons of each category. We also examine several further aspects: the complexity of the algorithms; the relationship between spectra and dimensions; the impact of noise on each category; the consequences of "holes" in the parameter space; locality-preserving rates, which can detect whether the topological structure of the manifold is destroyed; and magnification factors, principal spreading directions and their quantitative criteria, which allow us to explore more detailed properties of the mapping. As an application to face recognition, we choose a suitable algorithm for DR, and it yields a higher recognition rate than traditional linear DR methods.

(2) As an important branch of manifold learning, the graph embedding algorithms (cf. Chapter 2) can be parametrized (including linearization and kernelization) to form a complete DR framework, but the O(N^3) cost of optimizing the kernelized models makes them impractical for real applications. We propose an approximation algorithm based on AP-initialized k-means, which has a lower quantization error and is therefore a better approximation to the Gram matrix than other methods using the same number of exemplars (an illustrative sketch of this kind of landmark-based approximation is given after this paragraph). We analyze the approximation of different spectra, and experiments verify that different applications demand different spectra. We also derive an error bound for measuring the approximation of the mapping, which is proved to be bounded by the quantization error as well. Direct approximation of the mapping function has a more intuitive interpretation than approximating the Gram matrix, and in our graph embedding experiments the former achieves better performance with fewer parameters than the latter. In earlier work we compared several linearized graph embedding algorithms; with the approximation algorithms we can now compare them on large-scale problems, from which we draw some interesting conclusions, e.g. the graph embeddings corresponding to minimal eigenvalues suffer from kernel functions with fast-decreasing spectra, and local models might be realized via local kernels instead.
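The thesis's own exemplar-selection procedure (AP-initialized k-means) and its quantization-error bound are not reproduced here. The following is only a minimal Python sketch, under assumed choices (an RBF kernel, plain random k-means initialization, and illustrative function names), of the general idea referenced in item (2): approximating an N x N Gram matrix through a small set of landmark exemplars, in the style of a Nystrom approximation.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        """RBF kernel matrix between the rows of A and the rows of B."""
        sq = (np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-gamma * sq)

    def kmeans_exemplars(X, m, n_iter=50, seed=0):
        """Plain k-means; the m centers are used as exemplars/landmarks.
        (The thesis initializes k-means with affinity propagation; a random
        initialization is used here purely for brevity.)"""
        rng = np.random.default_rng(seed)
        C = X[rng.choice(len(X), m, replace=False)]
        for _ in range(n_iter):
            d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)   # N x m squared distances
            labels = d.argmin(1)
            for j in range(m):
                members = X[labels == j]
                if len(members):
                    C[j] = members.mean(0)
        return C

    def nystrom_gram(X, m=50, gamma=1.0):
        """Landmark (Nystrom-style) approximation  K ~ K_nm K_mm^+ K_mn.
        Downstream spectral methods only need the N x m and m x m factors,
        which avoids decomposing the full N x N Gram matrix; the dense
        product is formed here only so the error can be inspected."""
        C = kmeans_exemplars(X, m)
        K_nm = rbf_kernel(X, C, gamma)      # N x m cross-kernel block
        K_mm = rbf_kernel(C, C, gamma)      # m x m landmark block
        return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

    # Usage: relative error of the approximation against the exact Gram matrix.
    X = np.random.default_rng(1).normal(size=(500, 10))
    K_exact = rbf_kernel(X, X)
    K_approx = nystrom_gram(X, m=50)
    print(np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact))

With m far smaller than N, the spectral computations operate on the m x m and N x m blocks, which is what removes the O(N^3) bottleneck mentioned in item (2); the quality of the exemplars (their quantization error) governs how faithful the approximation is.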
(3) We propose a supervised DR algorithm based on recently developed kernel-based independence criteria (a minimal sketch of such a criterion follows item (4) below). Our analysis shows that it yields results comparable to SDR algorithms such as KDR, but at a lower computational cost of O(N^2) plus one multiplication of size-N matrices, whereas KDR takes O(N^3) time. We discuss the potential problems of our models on some simulated data, but on many practical problems our models yield results comparable to KDR. We also develop an algorithm to find an upper bound for SDR, an issue rarely addressed in the literature. A further analysis reveals the relationship between these algorithms and graph embedding; graph embedding therefore yields a reasonable initial value for these algorithms and cuts down the cost of random search. To incorporate unsupervised information into these models, a Laplacian smoother is employed, which outperforms supervised models when the data are projected into a low-dimensional space. We also point out a potential research direction of solving unsupervised DR and CCA problems with these algorithms.

(4) Ordinal relationships are very important in many practical problems because they reveal how the data are distributed on a manifold, and in our experiments we find that they improve the generalization ability of learners. We treat these problems differently from traditional discriminative tasks and call this setting tendency learning. We show the differences and connections between tendency learning and other well-known learning problems, e.g. classification models the separating surface, whereas tendency learning models the transition. We analyze the feasibility of applying two linear models (SVM and PKLR) to tendency learning, which shows that the latter is more suitable. The DAG-regularized model we obtain has to be solved with CCCP because of its non-convex constraints. Our experiments on two simulated and two real data sets show that the DAG-regularized model for tendency learning outperforms other supervised and semi-supervised models, especially when few data points are labeled.
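As a companion to item (3): the abstract does not spell out which independence criterion is used, so the sketch below uses the Hilbert-Schmidt Independence Criterion (HSIC), a standard kernel-based dependence measure whose biased empirical estimate needs only two N x N Gram matrices and elementwise operations, consistent with the O(N^2) scaling quoted above. It is a generic illustration, not the thesis's SDR algorithm; the kernel choices, bandwidths, and the toy projection comparison are assumptions.

    import numpy as np

    def rbf_gram(Z, gamma=1.0):
        """RBF Gram matrix of the rows of Z."""
        sq = (np.sum(Z ** 2, 1)[:, None] + np.sum(Z ** 2, 1)[None, :]
              - 2.0 * Z @ Z.T)
        return np.exp(-gamma * sq)

    def center(K):
        """Double-center a Gram matrix (equivalent to H K H with
        H = I - (1/n) 1 1^T), done in O(N^2) without matrix products."""
        K = K - K.mean(axis=0, keepdims=True)
        return K - K.mean(axis=1, keepdims=True)

    def hsic(X, Y, gamma_x=1.0, gamma_y=1.0):
        """Biased empirical HSIC estimate between paired samples X and Y.
        Everything is O(N^2): two N x N Gram matrices, centering, and an
        elementwise product."""
        n = len(X)
        K = rbf_gram(X, gamma_x)
        L = rbf_gram(Y, gamma_y)
        return float(np.sum(center(K) * L)) / (n - 1) ** 2

    # Toy usage: score two candidate 1-D projections of X by their dependence
    # on the response y; the informative direction gets the larger HSIC value.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    y = (X[:, 0] + 0.1 * rng.normal(size=300)).reshape(-1, 1)
    w_good = np.array([[1.0], [0.0], [0.0], [0.0], [0.0]])
    w_bad = np.array([[0.0], [1.0], [0.0], [0.0], [0.0]])
    print(hsic(X @ w_good, y), hsic(X @ w_bad, y))

A supervised DR method in this family keeps the projection that maximizes the dependence between the projected inputs and the labels; as noted in item (3), a graph embedding solution can serve as the initial projection and reduce the cost of random search.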
Keywords/Search Tags: machine learning, dimensionality reduction, manifold learning, kernel methods, graph embedding, independence criteria, supervised learning, semi-supervised learning, unsupervised learning