
Towards tractable parameter-free statistical learning

Posted on: 2005-09-01
Degree: Ph.D
Type: Dissertation
University: University of Southern California
Candidate: D'Souza, Aaron Angelo
Full Text: PDF
GTID: 1458390008498980
Subject: Computer Science
Abstract/Summary:
The objectivity of statistical analysis hinges on the assumptions made about the form and complexity of the model used to fit the data. These assumptions usually take the guise of "nuisance parameters" that must be set based on some meta-level knowledge of the problem to be solved. This dissertation seeks to contribute statistical methods that require as little meta-level knowledge as possible, yet remain computationally and analytically tractable enough to operate on real-world datasets.

This goal is partially achieved within the framework of Bayesian statistics, which allows the specification of prior knowledge and lets the data correctly constrain model complexity. However, for all but the simplest statistical models, a full Bayesian treatment is often analytically and computationally intractable. We therefore explore the usefulness of approximation techniques, in particular those stemming from variational calculus, to gain analytical tractability when performing statistical inference in complex graphical models.

We provide a novel, analytically closed-form solution to estimating the cardinality of mixture models by locally approximating the evidence for splitting existing models, thus growing model complexity as needed. We contribute a solution to the problem of estimating forgetting rates for online learning by modeling the non-stationarity of the model as a set of drifting parameters, allowing a variational Kalman smoother to estimate the time scale of the process drift. We also address the estimation of Bayesian distance metrics for locally weighted regression (a problem commonly known as supersmoothing) by probabilistically modeling the kernel weights assigned to the data.

Another contribution of this dissertation is the development of statistical inference methods that are computationally scalable. We derive a probabilistic version of back-fitting (a highly robust and scalable class of supervised non-parametric algorithms) and demonstrate that, among others, the framework of sparse Bayesian learning arises from this class as a special case.

We conclude that in several difficult statistical learning problems, principled approximation techniques and careful model construction can create scalable and robust algorithms that eliminate the most difficult model complexity parameters while retaining applicability to large, complex, and underconstrained datasets.
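
To make the locally weighted regression setting above concrete, the following is a minimal Python/NumPy sketch (not the dissertation's algorithm) of how a Gaussian kernel with a distance metric D assigns weights to training points around a query; the helper name lwr_predict, the hand-fixed metric D, and the toy data are illustrative assumptions, whereas the dissertation's contribution is to infer such kernel weights probabilistically rather than fix the metric by hand.

    # Minimal sketch: locally weighted regression with a fixed Gaussian
    # kernel distance metric D (the dissertation instead treats the
    # resulting kernel weights probabilistically).
    import numpy as np

    def lwr_predict(X, y, x_query, D):
        """Predict at x_query by kernel-weighted least squares."""
        diff = X - x_query                                   # (N, d) offsets from the query
        w = np.exp(-0.5 * np.sum(diff @ D * diff, axis=1))   # Gaussian kernel weights
        Xb = np.hstack([X, np.ones((len(X), 1))])            # add bias column
        W = np.diag(w)
        beta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)  # weighted least squares
        return np.append(x_query, 1.0) @ beta

    # Example: noisy 1-D sine data, query at x = 0.5 with metric D = [[25.]]
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 2 * np.pi, (200, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
    print(lwr_predict(X, y, np.array([0.5]), np.array([[25.0]])))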
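
Similarly, the back-fitting procedure mentioned above can be illustrated with a minimal sketch of classical backfitting for an additive model, using per-coordinate linear fits as the component smoothers; the function name backfit, the centering step, and the iteration count are illustrative assumptions, and the dissertation's probabilistic back-fitting replaces these deterministic updates with Bayesian inference.

    # Minimal sketch: classical backfitting for an additive model
    # y ~ offset + sum_m f_m(x_m), with linear per-coordinate smoothers.
    import numpy as np

    def backfit(X, y, n_iters=50):
        Xc = X - X.mean(axis=0)                  # center each predictor
        offset = y.mean()                        # global intercept
        coef = np.zeros(Xc.shape[1])             # slope of each component f_m
        for _ in range(n_iters):
            for m in range(Xc.shape[1]):
                # partial residual: remove the fit of all other components
                r = y - offset - Xc @ coef + Xc[:, m] * coef[m]
                coef[m] = Xc[:, m] @ r / (Xc[:, m] @ Xc[:, m])  # 1-D least squares
        return offset, coef

    # Example: three linear additive components plus noise
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 3))
    y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(500)
    print(backfit(X, y))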
Keywords/Search Tags: Statistical, Model, Complexity, Data