Nonparametric models for high-dimensional data analysis

Posted on: 2013-08-12
Degree: Ph.D
Type: Dissertation
University: The University of Utah
Candidate: Gerber, Samuel
Full Text: PDF
GTID: 1458390008482725
Subject: Statistics
Abstract/Summary:
This dissertation is about the analysis of high-dimensional data sets. Many disciplines face challenges that arise naturally as, or can be abstracted into, analysis tasks on point clouds that reside in a high-dimensional space. The list of applications ranges from gene expression analysis to the analysis of computer simulations and socioeconomic surveys. For instance, in neuroanatomy, the effect of pathologies, substance abuse, or normal growth on anatomy can be explored through population analysis of sets of brain magnetic resonance images, each consisting of thousands of measurements.

The focus of this dissertation is on the development of tools that support the data analyst in developing insight about the process or system that generated the observed data. The dissertation consists of a collection of six papers that deal with nonparametric methods to build human-interpretable models of high-dimensional data sets. The fundamental challenge in the analysis of high-dimensional data sets is the exponential growth of possible structures in the data, also referred to as the curse of dimensionality. The six papers are split into two main topics that address the curse of dimensionality in different ways.

The first topic (Chapters 2 through 5, papers one through four) considers dimensionality reduction and, in particular, manifold learning. The first paper highlights some of the challenges in existing manifold learning approaches. The second paper proposes a manifold estimation scheme based on the statistical framework of principal curves. The third paper proposes a solution to the problem of model complexity control for principal curves. The fourth paper considers an application of the proposed manifold estimation approach to population analysis of brain magnetic resonance images.

The second topic (Chapters 6 through 8, papers five and six) concerns a novel nonparametric technique to model and visualize high-dimensional scalar functions. The basic framework is a decomposition of the domain based on the Morse-Smale complex. The Morse-Smale complex splits the domain of the function into regions in which the function behaves similarly. The fifth paper exploits this decomposition to build a visualization system that supports intuitive understanding and exploration of the function. Based on the same domain decomposition, the sixth paper proposes a partition-based regression technique.
Keywords/Search Tags: High-dimensional data, Paper proposes, Nonparametric