
Research On Manifold Learning: Theories And Applications In Images

Posted on: 2008-01-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q H Huang
Full Text: PDF
GTID: 1118360215950559
Subject: Signal and Information Processing

Abstract/Summary:
With the progress of science, and especially the development of the information industry, we have entered a new information age. Research in this age inevitably confronts large volumes of high-dimensional data, especially image data. In real-world applications, observations represented as images or vectors can be modeled as samples lying on or close to a low-dimensional nonlinear manifold, possibly corrupted by noise. Manifold learning is therefore an important tool of data mining: its goal is to recover the hidden low-dimensional structure of the nonlinear manifold from high-dimensional image data.

Several common issues determine the effectiveness of manifold learning. First, intrinsic dimension estimation has become an important research direction for high-dimensional image data: an accurate estimate of the intrinsic dimension helps reveal the intrinsic configuration of the data and guides dimensionality reduction and subsequent processing. Second, all manifold learning algorithms share a common characteristic: they use the local structure of the data to map the manifold globally to a lower-dimensional space, and although previous studies have pointed out relationships among the various algorithms, relating them within a kernel framework is a novel direction. Third, Riemannian normal coordinates encode the direction and distance from a specific point on a manifold to nearby points, and it is worthwhile to carry this technique over from its original setting in differential geometry to the task of manifold learning. This dissertation addresses these problems and gives relatively complete answers.

The contributions of this dissertation are as follows:

1. A new algorithm for estimating the intrinsic dimension of image data is presented. Without a priori knowledge of the manifold's geometry or topology except for its dimension, the key step is to construct a simplicial complex based on approximations to the tangent bundle of the manifold. An important property of the algorithm is that its complexity depends on the dimension of the manifold rather than on that of the embedding space. Successful examples are presented for reconstructing curves in the plane and in space and surfaces in space, and for estimating the intrinsic dimension of human face images; in addition, a case in which the algorithm fails is analyzed. (A simplified sketch of the tangent-space idea is given below.)
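To make the tangent-space idea concrete, here is a minimal sketch that estimates intrinsic dimension by local PCA on nearest-neighbour patches. It is an illustration only, not the simplicial-complex algorithm of the dissertation; the function name, the neighbourhood size k, and the 95% variance threshold are all assumptions made for this example.

```python
import numpy as np

def estimate_intrinsic_dimension(X, k=12, variance_threshold=0.95):
    """Average, over all points, the number of local PCA directions
    needed to explain `variance_threshold` of the variance in each
    k-nearest-neighbour patch (a crude tangent-space estimate)."""
    n = X.shape[0]
    dims = []
    for i in range(n):
        # k nearest neighbours of X[i], excluding the point itself.
        dists = np.linalg.norm(X - X[i], axis=1)
        idx = np.argsort(dists)[1:k + 1]
        patch = X[idx] - X[idx].mean(axis=0)
        # Singular values measure variance along local principal axes.
        s = np.linalg.svd(patch, compute_uv=False)
        var = np.cumsum(s**2) / np.sum(s**2)
        dims.append(int(np.searchsorted(var, variance_threshold)) + 1)
    return int(round(np.mean(dims)))

# Example: points on a Swiss-roll surface (a 2-D manifold in 3-D).
rng = np.random.default_rng(0)
t = rng.uniform(0.5, 3 * np.pi, 500)
h = rng.uniform(0, 10, 500)
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
print(estimate_intrinsic_dimension(X))   # typically prints 2
```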
2. A new method for robust manifold learning is presented. Probabilistic subspace mixture models, proposed over the last few years, are interesting methods for learning image manifolds. Their lack of a global mapping can be remedied by a recently developed method based on locally linear embedding, called locally linear coordination. For many practical applications, however, where outliers are common, this method lacks the necessary robustness. Here, the idea of robust mixture modeling with t-distributions is combined with probabilistic subspace mixture models. The resulting robust subspace mixture model is shown experimentally to give advantages in density estimation and classification of image data sets. It also solves the robustness problems of locally linear coordination by introducing a weighted redefinition of the embedding step.

3. Several well-known algorithms for dimensionality reduction on manifolds are interpreted as kernel methods. Isomap, graph Laplacian eigenmaps, and locally linear embedding (LLE) all use local neighborhood information to construct a global embedding of the manifold; we describe them as kernel PCA on specially constructed Gram matrices and illustrate the similarities and differences between the algorithms. Isomap, one of the most widely used low-dimensional embedding methods, combines geodesic distances on a weighted graph with classical scaling (metric multidimensional scaling). We address two critical issues not considered in the original Isomap: (1) the generalization property and (2) topological stability. A robust kernel Isomap method equipped with both properties is then presented. The proposed method relates Isomap to Mercer kernel machines, so that the generalization property emerges naturally through kernel principal component analysis. For topological stability, we investigate the network flow in a graph, which provides a method for eliminating critical outliers. The generalization property and topological stability of robust kernel Isomap are confirmed through experiments with several (image) data sets.

4. A fast manifold learning method based on Riemannian normal coordinates is presented. This coordinate system is, in a sense, a generalization of Cartesian coordinates in Euclidean space. To reduce the dimension of high-dimensional data, our implementation currently uses Dijkstra's algorithm for shortest paths in graphs together with some basic concepts from differential geometry. We expect this approach to open up new possibilities for the analysis of image data, where the coordinate system is learned from experimental high-dimensional data rather than defined by models. (A compact sketch combining these shortest-path geodesics with the kernel view of contribution 3 is given below.)
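As a complement to contributions 3 and 4, the following sketch illustrates the kernel-PCA view of Isomap: geodesic distances are computed with Dijkstra's algorithm on a k-nearest-neighbour graph, double-centred into a Gram matrix, and embedded via its top eigenvectors. This is a bare-bones illustration under assumed names and parameters; it omits the Mercer-kernel generalization and the network-flow outlier removal of the robust kernel Isomap, and it assumes the neighbourhood graph is connected.

```python
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.spatial.distance import pdist, squareform

def isomap_embed(X, k=10, n_components=2):
    n = X.shape[0]
    D = squareform(pdist(X))                  # pairwise Euclidean distances
    # k-nearest-neighbour graph; np.inf marks absent edges.
    W = np.full_like(D, np.inf)
    for i in range(n):
        idx = np.argsort(D[i])[1:k + 1]
        W[i, idx] = D[i, idx]
    W = np.minimum(W, W.T)                    # symmetrise the graph
    G = dijkstra(W, directed=False)           # geodesic (shortest-path) distances
    # Double centring turns squared geodesics into a Gram (kernel) matrix,
    # exactly as in classical scaling / kernel PCA.
    H = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * H @ (G ** 2) @ H
    vals, vecs = np.linalg.eigh(K)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

Y = isomap_embed(np.random.rand(300, 5))      # (300, 2) embedding
```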
Keywords/Search Tags: manifold learning, intrinsic dimension, kernel trick, Isomap, simplicial complex, image data