
Variable Selection For High-Dimensional Regression And Pairwise Screening

Posted on: 2021-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: C Hui
Full Text: PDF
GTID: 2370330602994288
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the continuing emergence of high-dimensional data and the growing demand for big data analysis, variable selection in linear regression models has become an increasingly active topic in statistics. Selecting the truly important variables from a large and diverse set of predictors is both important and challenging. This thesis reviews research on variable selection for linear regression.

Most existing screening methods focus on marginal effects and ignore the dependence between covariates. The method proposed in [5], by contrast, takes pairwise effects among covariates into account for both screening and penalization. It relies on the asymptotic distribution of the maximal absolute pairwise sample correlation among independent covariates. The novelty of the theory is that the convergence is with respect to the dimensionality p and is uniform with respect to the sample size n. Moreover, an upper bound can be obtained for the maximal pairwise R squared when the response is regressed onto two different covariates. The method in [5] is built on these extreme value results. Furthermore, by combining pairwise screening with Sure Independence Screening [4], a new regularized variable selection procedure is given in [5]. Under certain conditions, this procedure satisfies the oracle property.

The thesis is organized as follows. Chapter 1 introduces the background and content of the thesis. Chapter 2 reviews classical variable selection methods for the linear model, then introduces regularization methods such as the Lasso and ridge regression, as well as other approaches to variable selection such as Bayesian and clustering-based methods. Chapters 3 to 5 focus on the pairwise screening method proposed in [5], including its theoretical basis and properties. Chapter 6 examines the variable selection accuracy and prediction accuracy of this method through simulations. Finally, Chapter 7 analyzes the strengths and weaknesses of the new method and discusses related future work.
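The two quantities the abstract builds on can be illustrated with a minimal sketch. It is not the procedure of [5]: it only computes a marginal-correlation ranking in the spirit of Sure Independence Screening [4] and the maximal absolute pairwise sample correlation among simulated independent covariates. The function names `sis_screen` and `max_abs_pairwise_corr`, the cutoff `d`, and the simulation settings are all assumptions made for illustration.

```python
import numpy as np

def sis_screen(X, y, d):
    """Keep the d covariates with the largest absolute marginal
    sample correlation with y (hypothetical helper, SIS-style ranking)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    order = np.argsort(-np.abs(corr))
    return np.sort(order[:d])

def max_abs_pairwise_corr(X):
    """Maximal absolute sample correlation over all pairs of distinct
    columns of X (hypothetical helper)."""
    R = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(R, 0.0)  # ignore each column's correlation with itself
    return float(np.max(np.abs(R)))

# Assumed simulation setting: independent Gaussian covariates with p > n.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 3.0  # three truly active covariates (assumption)
y = X @ beta + rng.standard_normal(n)

selected = sis_screen(X, y, d=20)   # indices kept by marginal screening
max_corr = max_abs_pairwise_corr(X) # spurious correlation among independent covariates
print(selected, max_corr)
```

Even though the simulated covariates are independent, `max_corr` is not close to zero when p is large relative to n. This kind of spurious correlation is what the extreme value results mentioned in the abstract are meant to quantify.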
Keywords/Search Tags: pairwise screening, penalized regression, SIS method, variable selection