
A framework for representing non-stationary data with mixtures of linear models

Posted on: 2003-04-22
Degree: Ph.D
Type: Thesis
University: OGI School of Science & Engineering
Candidate: Archer, Cynthia Louise
Full Text: PDF
GTID: 2468390011479210
Subject: Computer Science
Abstract/Summary:
In this thesis, we present a latent data framework that facilitates formalizing observations of data behavior into statistical models. Using this framework, we derive two related models for a broad category of real-world data that includes images, speech data, and other measurements from natural processes. These models take the form of constrained Gaussian mixture models. Our statistical models lead to new algorithms for adaptive transform coding, a common method of signal compression, and adaptive principal component analysis, a technique for data modeling and analysis.

Adaptive transform coding is a computationally attractive method for compressing non-stationary multivariate data. A classic transform coder converts signal vectors to a new coordinate basis and then codes the transform coefficient values independently with scalar quantizers. An adaptive transform coder partitions the data into regions and compresses the vectors in each region with a custom transform coder. Prior art treats the development of transform coders heuristically, chaining suboptimal operations together. Instead of this ad hoc approach, we start from a statistical model of the data. Using this model, we derive, in closed form, a new optimal linear transform for coding. We incorporate this transform into a new transform coding algorithm that provides an optimal solution for non-stationary signal compression.

Adaptive principal component analysis (PCA) is an effective modeling tool for high-dimensional data. Classic PCA models high-dimensional data by finding the closest low-dimensional hyperplane to the data. Adaptive or local PCA partitions data into regions and performs PCA on the data within each region. Prior art underestimates the potential of this method by requiring a single global target dimension for the model hyperplanes. We develop a statistical model of the data that allows the target dimension to adjust to the data structure. This formulation leads to a new algorithm for adaptive PCA, which minimizes dimension reduction error subject to an entropy constraint. The entropy constraint, which derives naturally from the probability model, effectively controls model complexity. Our evaluations show that entropy-constrained adaptive PCA conforms to the natural cluster structure of data better than state-of-the-art modeling methods.
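To make the classic transform-coding pipeline described above concrete, the following is a minimal sketch in Python: it fits an orthonormal basis from the data (a PCA/KLT basis, used here as a stand-in, not the optimal transform derived in the thesis) and codes each transform coefficient independently with a uniform scalar quantizer. The function names and step sizes are illustrative assumptions; an adaptive coder would first partition the data and fit one such coder per region.

```python
import numpy as np

def fit_transform_coder(vectors, step_sizes):
    """Fit a classic transform coder: an orthonormal basis estimated from the
    sample covariance (a PCA/KLT basis, used as a stand-in here) plus
    per-coefficient uniform quantizer step sizes (illustrative values)."""
    mean = vectors.mean(axis=0)
    cov = np.cov(vectors - mean, rowvar=False)
    _, basis = np.linalg.eigh(cov)          # columns are the basis vectors
    return mean, basis, np.asarray(step_sizes, dtype=float)

def encode(vectors, mean, basis, step_sizes):
    """Project each vector onto the basis, then quantize each transform
    coefficient independently with a uniform scalar quantizer."""
    coeffs = (vectors - mean) @ basis
    return np.round(coeffs / step_sizes).astype(int)

def decode(indices, mean, basis, step_sizes):
    """Dequantize the coefficients and map them back to the signal space."""
    coeffs = indices * step_sizes
    return coeffs @ basis.T + mean

# Usage: compress 8-dimensional vectors with a coarse uniform step size.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 8))
mean, basis, steps = fit_transform_coder(x, step_sizes=[0.5] * 8)
x_hat = decode(encode(x, mean, basis, steps), mean, basis, steps)
print("mean squared reconstruction error:", np.mean((x - x_hat) ** 2))
```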
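The local PCA idea in the final paragraph can likewise be sketched under simplifying assumptions. The snippet below partitions the data with k-means (a simple stand-in for the thesis's constrained Gaussian mixture) and fits a separate PCA hyperplane per region, letting each region pick its own dimension by a retained-variance threshold, which stands in for, but is not, the entropy constraint described in the abstract. The names local_pca, reduce_and_reconstruct, and var_kept are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def local_pca(data, n_regions=4, var_kept=0.95, seed=0):
    """Sketch of local PCA: partition the data into regions, then fit a
    separate PCA hyperplane in each region.  The per-region dimension is
    chosen to retain a fixed fraction of the variance (an assumption, not
    the thesis's entropy-constrained criterion)."""
    labels = KMeans(n_clusters=n_regions, random_state=seed,
                    n_init=10).fit_predict(data)
    models = []
    for r in range(n_regions):
        x = data[labels == r]
        mean = x.mean(axis=0)
        evals, evecs = np.linalg.eigh(np.cov(x - mean, rowvar=False))
        evals, evecs = evals[::-1], evecs[:, ::-1]       # descending variance
        frac = np.cumsum(evals) / evals.sum()
        dim = int(np.searchsorted(frac, var_kept) + 1)   # local target dimension
        models.append({"mean": mean, "basis": evecs[:, :dim], "dim": dim})
    return labels, models

def reduce_and_reconstruct(data, labels, models):
    """Project each vector onto its region's hyperplane and map it back,
    exposing the dimension-reduction error the local model incurs."""
    recon = np.empty_like(data)
    for r, m in enumerate(models):
        x = data[labels == r] - m["mean"]
        recon[labels == r] = x @ m["basis"] @ m["basis"].T + m["mean"]
    return recon
```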
Keywords/Search Tags: Data, Model, PCA, Framework, Adaptive, Transform, Non-stationary, Statistical