
The Research Of A Class Of Sparse Logistic Regression Model Under Bayesian Framework

Posted on: 2022-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: M C Wang
Full Text: PDF
GTID: 2480306764495044
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the development of modern technologies such as big data and cloud computing, massive high-dimensional data has penetrated all aspects of social life. The feature dimension of such data is very large, often far larger than the number of observed samples, which creates great difficulties for data storage, modeling, and computation. However, this kind of data often contains a great deal of redundant information. When modeling and analyzing a specific problem, the number of useful features is usually far lower than the feature dimension of the data set. Sparse learning is the process of selecting the useful features of high-dimensional data in order to compress information. It is usually approached either through regularization of a sparse model or through Bayesian sparse learning, and the two approaches can be transformed into each other. Within the Bayesian framework, a relatively complete theory has been developed for the sparse learning problem of the linear model. However, the linear model has certain limitations and is not applicable to many problems. In this dissertation, we consider the sparse Bayesian learning problem under the generalized linear model, and study some properties of the sparse model under the Bayesian framework based on the widely used Logistic regression model. We place a generalized double Pareto prior, which has a sharp peak at zero and heavy tails, on the coefficients, so that coefficients near zero are clearly separated from the non-zero coefficients, thereby achieving sparse learning. We prove that this sparse prior has excellent theoretical properties. First, we reveal the relationship between maximum a posteriori estimation of the unknown coefficients and the regularization process, and derive the Oracle property, including sparsity and asymptotic normality, of the maximum a posteriori estimate for a given prior. Second, we study the properties of the posterior distribution of the Logistic regression coefficients under the generalized double Pareto prior from the Bayesian perspective, and show that the coefficient estimates asymptotically concentrate around the true sparse vector in the $\ell_2$ sense. Finally, the EM algorithm and the Gibbs sampling algorithm are used to compute the maximum a posteriori estimate and the posterior mean estimate, respectively, and the effectiveness of the proposed methods is verified by numerical simulation.
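
The link between maximum a posteriori estimation and regularization mentioned in the abstract can be sketched as follows. This is only an illustration, not the thesis's own formulation: it assumes one common parameterization of the generalized double Pareto density, f(b) = 1/(2ξ) · (1 + |b|/(αξ))^(-(α+1)), and the function names `gdp_log_density` and `neg_log_posterior`, as well as the default values of `xi` and `alpha`, are hypothetical. Under that assumption, the negative log prior acts as a logarithmic penalty added to the logistic negative log-likelihood.

```python
import numpy as np

def gdp_log_density(beta, xi=1.0, alpha=1.0):
    """Elementwise log density of a generalized double Pareto prior.

    Assumed parameterization (may differ from the thesis):
    f(b) = 1/(2*xi) * (1 + |b|/(alpha*xi)) ** (-(alpha + 1)).
    """
    beta = np.asarray(beta, dtype=float)
    return -np.log(2.0 * xi) - (alpha + 1.0) * np.log1p(np.abs(beta) / (alpha * xi))

def neg_log_posterior(beta, X, y, xi=1.0, alpha=1.0):
    """Negative log posterior, up to an additive constant, for labels y in {0, 1}.

    Minimizing this is a penalized logistic regression: the prior contributes
    the penalty (alpha + 1) * sum(log(1 + |beta_j| / (alpha * xi))).
    """
    z = X @ beta
    # Logistic negative log-likelihood: sum_i [log(1 + exp(z_i)) - y_i * z_i]
    nll = np.sum(np.logaddexp(0.0, z) - y * z)
    return nll - np.sum(gdp_log_density(beta, xi, alpha))
```

Because the penalty grows only logarithmically in |beta_j|, large coefficients are shrunk less than under an L1 penalty, while the sharp peak at zero still pushes small coefficients toward zero.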
Keywords/Search Tags: Logistic regression model, sparse learning, Bayesian estimation, EM algorithm, Gibbs sampling algorithm