Bayesian learning in classification and density estimation

Posted on: 2003-05-29
Degree: Ph.D
Type: Dissertation
University: University of California, San Diego
Candidate: Chan, Kwokleung
Full Text: PDF
GTID: 1468390011489216
Subject: Computer Science
Abstract/Summary:
Bayes' theorem is central to probability manipulation. Using Bayes' theorem in supervised classification transforms the problem into one of density estimation; this is called the generative approach. Discriminative classifiers, on the other hand, find direct mappings from input data to class labels. Various classifiers from the two categories were compared on a glaucoma standard automated perimetry (SAP) dataset. The comparison showed that when an appropriate density model is employed, generative classifiers can outperform their discriminative counterparts. In addition, their decisions are easier to understand and interpret through the intermediate steps of the probability manipulation.

Besides classification, many other machine learning problems involve building probabilistic models for data, such as clustering, finding hidden correlations, and estimating missing values. Independent Component Analysis (ICA) describes data as linear combinations of independent hidden factors. It finds non-trivial projections that can uncover interesting characteristics in the data. A new density model that fits data with clusters of ICA structure was developed. Unlike a mixture of factor analyzers, the ICA clusters model avoids shattering clusters into subpieces. Bayes' theorem is applied again to perform automatic model selection, which makes it possible to identify the number of clusters and the component dimensionality. A variational method was employed to overcome the intractability usually associated with a full Bayesian treatment. When applied to the glaucoma SAP dataset, the proposed algorithm discovered interesting patterns that were supported by physiological evidence. Finally, the variational Bayesian ICA was extended to handle incomplete data. This helps in analyzing small, high-dimensional datasets containing missing entries, a common concern when machine learning techniques are applied to challenging real-world data.
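To make the generative approach from the first part of the abstract concrete, the following is a minimal sketch, not the dissertation's method. It assumes one Gaussian density per class (the abstract does not say which density models were used), and the function names `fit_gaussian_classes`, `log_gaussian`, and `posterior` are hypothetical. Class-conditional densities are estimated first, and Bayes' theorem then turns them into class posteriors.

```python
import numpy as np

def fit_gaussian_classes(X, y):
    """Estimate a prior, mean and covariance for each class (assumed Gaussian)."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[int(label)] = {
            "prior": len(Xc) / len(X),
            "mean": Xc.mean(axis=0),
            # A small ridge keeps the covariance invertible.
            "cov": np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1]),
        }
    return params

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def posterior(x, params):
    """Bayes' theorem: p(class | x) is proportional to p(x | class) * p(class)."""
    labels = list(params)
    log_joint = np.array([
        np.log(params[c]["prior"])
        + log_gaussian(x, params[c]["mean"], params[c]["cov"])
        for c in labels
    ])
    log_joint -= log_joint.max()  # subtract the max for numerical stability
    p = np.exp(log_joint)
    return dict(zip(labels, p / p.sum()))
```

The intermediate quantities (priors, per-class log densities) remain available for inspection, which is the interpretability advantage the abstract attributes to generative classifiers. A discriminative classifier would instead fit the mapping from `x` to the label directly, without modeling `p(x | class)`.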
Keywords/Search Tags:ICA, Density, Classification, Data, Bayes' theorem, Bayesian