
Machine Learning over Thoroughly Unstructured Data

Posted on: 2016-06-02
Degree: Ph.D.
Type: Thesis
University: The University of Wisconsin - Madison
Candidate: Ansari, M. Hidayath
Full Text: PDF
GTID: 2478390017981329
Subject: Computer Science
Abstract/Summary:
This thesis examines a class of problems in which the spatial layout (shape) of data points enables inductive inference. We (1) introduce novel mathematical and computational tools that are inherently sensitive to shape and (2) formulate spatially sensitive transformations that simplify the application of pre-existing methodologies, such as support vector machines. Our choice of representation, point sets, enables fuller yet lower-dimensional descriptions of data. This representation closely models many real-world knowledge representation needs that benefit from its flexibility. We solve problems in classification, clustering, and regression, for many of which spatial knowledge is crucial for obtaining a solution. Furthermore, we demonstrate that previous approaches sometimes ignore the most basic and informative aspects of data and, in retrospect, provide counter-intuitive solutions.

We explore novel and existing measures of similarity between point sets that exploit the geometric and spatial relationships between data points in the underlying domain. Many of these techniques are built upon innovative ways of extending an intuitive notion of "spatial overlap" between solids to rigorous definitions for sets of points, which by definition are zero-dimensional and thus have no overlap. In addition to a study of theoretical aspects of the point set representation, we also show extensive demonstrations of its diverse applicability.

In the neuroscience domain, we introduce a new framework using these techniques that allows us to reason about individuals, as opposed to populations. We study the problem of detecting minute, short-term changes in white matter structure in the brain and relating them to changes in cognitive test scores and genetic biomarkers. Our results present the first evidence demonstrating that very small changes in white matter structure over a two-year period can predict change in cognitive function in healthy adults.

In other domains, we present new results and techniques in clustering comparison, natural language processing, object recognition in images, goodness-of-fit testing, and multivariate point set classification.
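
The idea of giving zero-dimensional points a notion of "spatial overlap" can be illustrated with a small sketch. The code below is not taken from the thesis and does not reproduce its definitions; it is one common way to realize the idea, assuming each point is smoothed into an isotropic Gaussian of width `sigma` and the overlap of two sets is the average overlap of their smoothed points. The function names `gaussian_overlap` and `point_set_similarity`, and the parameter `sigma`, are hypothetical and chosen only for this example.

```python
import math


def gaussian_overlap(xs, ys, sigma=1.0):
    """Mean pairwise overlap between two point sets (illustrative assumption).

    Each point is treated as an isotropic Gaussian with standard deviation
    sigma. Up to a constant factor, the integral of the product of two such
    Gaussians centred at p and q is exp(-||p - q||^2 / (4 * sigma^2)).
    """
    if not xs or not ys:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for p in xs:
        for q in ys:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (4.0 * sigma ** 2))
    return total / (len(xs) * len(ys))


def point_set_similarity(xs, ys, sigma=1.0):
    """Overlap normalised by each set's self-overlap, so the result lies in (0, 1]."""
    cross = gaussian_overlap(xs, ys, sigma)
    self_x = gaussian_overlap(xs, xs, sigma)
    self_y = gaussian_overlap(ys, ys, sigma)
    return cross / math.sqrt(self_x * self_y)


if __name__ == "__main__":
    near = [(0.0, 0.0), (1.0, 0.0)]
    shifted = [(0.1, 0.0), (1.1, 0.0)]
    far = [(10.0, 10.0), (11.0, 10.0)]
    print(point_set_similarity(near, shifted))  # close to 1
    print(point_set_similarity(near, far))      # close to 0
```

Under these assumptions, two sets that occupy the same region score near 1 and well-separated sets score near 0, even though no individual points coincide. The choice of `sigma` sets the spatial scale at which points are considered to overlap.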
Keywords/Search Tags:Data, Point, Spatial