
Probabilistic Feature Selection And Classification Vector Machine

Posted on: 2017-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
GTID: 2308330485951830
Subject: Computer software and theory

Abstract/Summary:
Bayesian learning, which provides a natural and unified way to attack many difficult data-modeling problems, has become an important branch of machine learning. Sparse Bayesian learning (SBL) algorithms are among the state-of-the-art machine learning methods: they incorporate prior assumptions, make probabilistic predictions, and output sparse solutions. However, some SBL algorithms, such as the relevance vector machine (RVM) and the probabilistic classification vector machine (PCVM), lack feature selection, which degrades their performance when the data sets contain many irrelevant and/or redundant features. To tackle this problem, this thesis proposes a sparse Bayesian approach that simultaneously selects the relevant samples and features for classification. We call it the probabilistic feature selection and classification vector machine (PFCVM); it adopts truncated Gaussian priors over both the sample weights and the feature weights. We derive the optimal solution of the proposed model in two ways: an expectation-maximization (EM) algorithm yields a maximum a posteriori (MAP) solution, and Laplace's method yields a fully Bayesian solution based on type-II maximum likelihood. Experiments on benchmark data sets and on high-dimensional data sets validate the performance of PFCVM under two criteria: classification accuracy and the effectiveness of the selected features. Finally, we analyse the generalization performance of PFCVM and derive a generalization bound for it; by tightening the bound, we demonstrate the significance of sparseness for improving the model's generalization ability.

The main work of this thesis can be summarized as follows:

(1) Unlike traditional Bayesian learning algorithms, the proposed algorithm selects both the relevant features and the relevant samples during training, which reduces the impact of irrelevant and/or redundant features.

(2) We introduce sparseness-promoting priors not only over the sample weights but also over the feature weights, and we use two methods to compute the maximum-probability solution of the proposed model (a sketch of the assumed model form and of the EM machinery is given after this list).

(3) The performance of the proposed algorithm is extensively tested in the experiments chapter: we validate its prediction accuracy and its feature-selection ability on both benchmark data sets and gene expression data sets.

(4) To analyse the generalization of the proposed algorithm, we derive a Rademacher-complexity-based generalization bound (a standard bound of this type is recalled below). By tightening the bound, we not only demonstrate the significance of feature selection but also obtain a way to choose good starting points for PFCVM.
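For concreteness, the display below sketches one plausible form of the model just described. It is an assumption pieced together from the PCVM family rather than a formula quoted from the thesis: the kernel parameterization $k_{\boldsymbol\theta}$, the precision symbols $\alpha_i$ and $\beta_k$, and the exact truncation directions are all illustrative.

\[
p(y = 1 \mid \mathbf{x}) = \sigma\Big( \sum_{i=1}^{N} w_i \, k_{\boldsymbol\theta}(\mathbf{x}, \mathbf{x}_i) + b \Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}},
\]
with truncated Gaussian priors on both weight vectors,
\[
p(w_i \mid \alpha_i) = 2\,\mathcal{N}(w_i \mid 0, \alpha_i^{-1})\,\mathbb{1}(y_i w_i \ge 0), \qquad
p(\theta_k \mid \beta_k) = 2\,\mathcal{N}(\theta_k \mid 0, \beta_k^{-1})\,\mathbb{1}(\theta_k \ge 0).
\]

Under priors of this shape, driving a precision $\alpha_i$ or $\beta_k$ to infinity removes the corresponding sample or feature from the model, which is how a single mechanism yields both sample sparsity and feature selection.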
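The abstract names two solution routes: EM for a MAP estimate, and Laplace's method for type-II maximum likelihood. The Python sketch below illustrates that machinery in its classic RVM-style form (a Laplace/Newton approximation of the weight posterior in the E-step, and the standard re-estimate alpha_i <- 1/(mu_i^2 + Sigma_ii) in the M-step). It is a minimal sketch under stated assumptions, not the thesis's PFCVM derivation: the truncated priors and the feature-weight updates are omitted, and all function and variable names are our own.

```python
import numpy as np

def sigmoid(z):
    # Clipped logistic function to avoid overflow warnings.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def rbf_kernel(X1, X2, gamma=1.0):
    # RBF kernel matrix between two sample sets.
    d = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def em_map_sparse_classifier(X, y, n_iter=50, prune=1e6):
    """EM-style sparse Bayesian classification (RVM-flavoured sketch).

    y must be in {0, 1}.
    E-step: Laplace approximation of the weight posterior given precisions.
    M-step: alpha_i <- 1 / (mu_i^2 + Sigma_ii), the standard SBL update.
    """
    K = rbf_kernel(X, X)                 # kernel basis as the design matrix
    N = K.shape[0]
    alpha = np.ones(N)                   # per-weight prior precisions
    w = np.zeros(N)
    for _ in range(n_iter):
        # --- E-step: MAP weights via Newton iterations (IRLS) ---
        for _ in range(25):
            p = sigmoid(K @ w)
            g = K.T @ (y - p) - alpha * w                # gradient of log-posterior
            B = p * (1 - p)                              # Bernoulli variances
            H = K.T @ (K * B[:, None]) + np.diag(alpha)  # negative Hessian
            w = w + np.linalg.solve(H, g)
        Sigma = np.linalg.inv(H)         # Laplace posterior covariance
        # --- M-step: re-estimate precisions; huge alpha => weight pruned ---
        alpha = np.minimum(1.0 / (w ** 2 + np.diag(Sigma)), prune)
    relevant = alpha < prune             # surviving "relevance vectors"
    return w, relevant

if __name__ == "__main__":
    # Toy usage on hypothetical data: two Gaussian blobs, labels in {0, 1}.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (20, 2)), rng.normal(1, 1, (20, 2))])
    y = np.array([0] * 20 + [1] * 20)
    w, relevant = em_map_sparse_classifier(X, y)
    print("kept", relevant.sum(), "of", len(w), "basis functions")
```

PFCVM extends this loop with a second set of precisions over the feature weights and with expectations taken under truncated Gaussians, but the sparsity mechanism, precisions growing until a basis function is pruned, is the same.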
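For reference, the standard Rademacher-complexity bound that results like the one in contribution (4) typically specialize reads as follows, for a loss bounded in $[0, 1]$ over $m$ i.i.d. samples; the thesis's bound for PFCVM will differ in its constants and in how the complexity term is tightened via sparseness.

\[
R(f) \;\le\; \hat{R}_m(f) \;+\; 2\,\mathfrak{R}_m(\mathcal{F}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2m}} \qquad \text{with probability at least } 1 - \delta,
\]
where $R(f)$ is the true risk, $\hat{R}_m(f)$ the empirical risk, and $\mathfrak{R}_m(\mathcal{F})$ the Rademacher complexity of the hypothesis class. Sparser models correspond to a smaller effective class $\mathcal{F}$, hence a smaller $\mathfrak{R}_m(\mathcal{F})$ and a tighter bound.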
Keywords: Machine learning, Bayesian reasoning, feature selection, probabilistic classification model, sparse learning