Bayesian models for unsupervised feature selection

Posted on: 2013-08-29
Degree: Ph.D.
Type: Thesis
University: Northeastern University
Candidate: Guan, Yue
Full Text: PDF
GTID: 2458390008972011
Subject: Computer Science
Abstract/Summary:
This dissertation focuses on developing probabilistic models for unsupervised feature selection. High-dimensional data often contain irrelevant and redundant features, which can hurt learning algorithms. These unwanted features can be removed either by selecting a subset of the original features (feature selection) or by transforming the data into a lower-dimensional feature space. Principal component analysis (PCA) is a popular transformation-based dimensionality reduction method, but it is difficult to interpret which of the original features are important in a PCA solution. We have designed sparse probabilistic PCA and mixture of sparse probabilistic PCA formulations. Casting sparse PCA as a probabilistic Bayesian model gives us the benefit of automatic model selection. We examined three different sparsity-inducing priors: (1) a two-level hierarchical prior equivalent to a Laplacian distribution, and consequently to L1 regularization; (2) an inverse-Gaussian prior; and (3) a Jeffreys prior. We learn these models by applying variational inference.

Methods in the unsupervised feature selection literature select either global or local features. Global methods select a single feature set for the whole data set, whereas local methods (subspace clustering methods) select one feature subset per cluster, so the selected features can differ across clusters. In this dissertation, we provide a unified probabilistic model that can be configured to perform either global or local feature selection. In our preliminary work, we built such a model by tying the priors of our mixture of sparse probabilistic PCA. We then develop this unified model further, and more directly, through a Beta-Bernoulli hierarchical prior on the features: simply adjusting the variance of the Beta prior switches between the two regimes. We apply this Beta-Bernoulli prior within a Dirichlet process mixture to select features for clustering.

Finally, a single data set may be multi-faceted: it can be grouped and interpreted in many different ways (which we call views), especially for high-dimensional data, where feature selection is typically needed. However, most clustering algorithms produce a single clustering solution, and similarly, feature selection for clustering seeks one feature subset in which one interesting clustering solution resides. In this thesis, we also develop a probabilistic nonparametric Bayesian model that simultaneously discovers several possible clustering solutions and the feature-subset views that generate each cluster partitioning.
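To make the first of the three priors concrete: the two-level hierarchy places an exponential prior on a per-weight variance and a Gaussian on the weight given that variance, and marginalizing out the variance yields a Laplacian, whose negative log-density is an L1 penalty. The sketch below is a minimal illustration of this standard equivalence, not the dissertation's code; the rate parameter `lam` and sample count are hypothetical choices made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0        # hypothetical Laplace rate parameter
n = 200_000

# Two-level hierarchical prior:
#   tau_i ~ Exponential(rate = lam^2 / 2)   (per-weight variance)
#   w_i | tau_i ~ N(0, tau_i)
# Marginally, w_i ~ Laplace(0, 1/lam), whose log-density is
# log(lam/2) - lam * |w|, i.e. an L1 penalty on w.
tau = rng.exponential(scale=2.0 / lam**2, size=n)  # numpy scale = 1/rate
w = rng.normal(0.0, np.sqrt(tau))

# Direct Laplace samples for comparison.
w_direct = rng.laplace(0.0, 1.0 / lam, size=n)

# E|w| for Laplace(0, b) is b = 1/lam; both estimates should be ~0.5.
print(np.mean(np.abs(w)), np.mean(np.abs(w_direct)))
```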
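The Beta-Bernoulli construction in the second paragraph can likewise be read generatively: each feature draws a shared inclusion probability from a Beta prior, and each cluster then draws a binary keep/drop indicator from that probability. The sketch below is a rough illustration under this simple reading only; the dissertation embeds such indicators in a Dirichlet process mixture, and the function name and parameter settings here are hypothetical. It shows how the variance of the Beta prior moves the model between global and local selection.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_feature_masks(n_clusters, n_features, a, b):
    """Per-cluster feature masks under a Beta-Bernoulli prior.

    pi_d ~ Beta(a, b) is shared across clusters for feature d;
    z[k, d] ~ Bernoulli(pi_d) is cluster k's keep/drop indicator.
    """
    pi = rng.beta(a, b, size=n_features)             # shared inclusion probs
    return rng.random((n_clusters, n_features)) < pi  # per-cluster masks

# High-variance Beta(0.1, 0.1): each pi_d lands near 0 or 1, so every
# cluster makes the same per-feature decision -> global selection.
z_global = sample_feature_masks(n_clusters=5, n_features=8, a=0.1, b=0.1)

# Low-variance Beta(50, 50): pi_d ~= 0.5 for every d, so each cluster's
# Bernoulli draw is an independent coin flip -> local, per-cluster subsets.
z_local = sample_feature_masks(n_clusters=5, n_features=8, a=50, b=50)

# Fraction of features on which all 5 clusters agree (all keep or all drop):
# near 1.0 for the global setting, near 2 * 0.5**5 = 0.0625 for the local one.
print((z_global.min(0) | ~z_global.max(0)).mean())
print((z_local.min(0) | ~z_local.max(0)).mean())
```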
Keywords/Search Tags: Feature, Model, Sparse probabilistic PCA, Bayesian, Clustering, Data