Graph Embedding and Nonlinear Dimensionality Reduction

Posted on: 2012-11-30
Degree: Ph.D
Type: Thesis
University: Columbia University
Candidate: Shaw, Blake
Full Text: PDF
GTID: 2458390011952283
Subject: Computer Science
Abstract/Summary:
Traditionally, spectral methods such as principal component analysis (PCA) have been applied to many graph embedding and dimensionality reduction tasks. These methods aim to find low-dimensional representations of data that preserve its inherent structure. However, they often perform poorly when applied to data that does not lie on or near a linear manifold. In this thesis, I present a set of novel graph embedding algorithms which extend spectral methods, allowing graph representations of high-dimensional data or networks to be accurately embedded in a low-dimensional space. I first propose minimum volume embedding (MVE) which, like other leading dimensionality reduction algorithms, first encodes the high-dimensional data as a nearest-neighbor graph, where the edge weights between neighbors correspond to kernel values between points, and then embeds this graph in a low-dimensional space. Next, I present structure preserving embedding (SPE), an algorithm for embedding unweighted graphs where the similarity between nodes is not known. SPE finds low-dimensional embeddings which explicitly preserve graph topology, meaning that a connectivity algorithm, such as k-nearest neighbors, will recover the edges of the input graph from only the coordinates of the nodes after embedding. I further explore preserving graph structure during embedding, and find the concept applicable to dimensionality reduction, large-scale network visualization, and metric learning for link prediction. This thesis posits that simply preserving pairwise distances, as many spectral methods do, is insufficient for capturing the structure of many datasets, and that preserving both local distances and graph topology is crucial for producing accurate low-dimensional representations of networks and high-dimensional data.
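The structure-preservation property described in the abstract can be illustrated with a small check. The sketch below is not the SPE algorithm and does not compute an embedding; it only tests whether a given set of coordinates lets k-nearest neighbors recover the edges of an input graph. The function names (`knn_neighbors`, `preserves_topology`), the toy ring graph, and the two hand-made embeddings are all assumptions introduced for illustration.

```python
import math


def knn_neighbors(coords, k):
    """For each point, return the set of indices of its k nearest other points."""
    n = len(coords)
    result = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        result.append(set(others[:k]))
    return result


def preserves_topology(coords, adjacency, k):
    """True if k-NN on the coordinates recovers every node's neighbor set exactly."""
    return knn_neighbors(coords, k) == [set(nbrs) for nbrs in adjacency]


# Toy input graph (hypothetical): a ring on 6 nodes, so every node has 2 neighbors.
n = 6
ring = [{(i - 1) % n, (i + 1) % n} for i in range(n)]

# Embedding A: nodes evenly spaced on a unit circle.
circle = [
    (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
    for i in range(n)
]

# Embedding B: the same nodes laid out on a line, which breaks the 0-5 edge.
line = [(float(i), 0.0) for i in range(n)]

print(preserves_topology(circle, ring, k=2))
print(preserves_topology(line, ring, k=2))
```

In this toy case the circular layout passes the check and the linear layout fails it, because the end nodes of the line are no longer each other's nearest neighbors. The check assumes a graph in which every node has exactly k neighbors; graphs with varying degrees would need a different connectivity rule.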
Keywords/Search Tags:Graph, Embedding, Dimensionality reduction, Spectral methods, High-dimensional data, Low-dimensional, Preserving