
Variable Selection via Confidence Regions

Posted on: 2011-02-25
Degree: Ph.D.
Type: Dissertation
University: North Carolina State University
Candidate: Gunes, Funda
Full Text: PDF
GTID: 1440390002457081
Subject: Statistics
Abstract/Summary:
Recently, penalized regression methods for variable selection have become popular. These methods continuously shrink the model parameters and perform selection by setting coefficients exactly to zero. The amount of shrinkage is controlled by a non-negative regularization parameter: as this parameter increases, the regression coefficients shrink continuously toward zero. An important issue in using a penalized regression method is choosing the value of the regularization parameter, that is, tuning the procedure. Asymptotic results for methods such as the SCAD and the adaptive LASSO are available when the tuning parameter changes at an appropriate asymptotic rate. In finite samples, however, a specific value of the tuning parameter must be chosen, and which criterion to use in practice is a difficult question.

Chapter 1 proposes an approach to tuning penalized regression variable selection methods by calculating the sparsest estimator contained in a confidence region of a specified level, for the classical fixed dimensional case n > p. Because confidence intervals and regions are generally well understood, tuning penalized regression methods in this way is intuitive and easily understood by scientists and practitioners. More importantly, our work shows that tuning to a fixed confidence level of 95% performs better than tuning via the common methods based on AIC, BIC, or cross-validation over a wide range of sample sizes and levels of sparsity. Additionally, we prove that by tuning with a sequence of confidence levels converging to one, asymptotic selection consistency is obtained, and that a simple two-stage procedure achieves an oracle property. The confidence region based tuning parameter is easily calculated using output from existing penalized regression computer packages.

Although penalized regression methods are more popular, stepwise regression methods, such as forward or backward selection, remain widely used in practice. The typical way to choose the final model for a stepwise method is a fixed entry significance level alpha, which usually inflates the Type I error. Chapter 2 proposes confidence region based tuning specifically for forward regression, which controls the Type I error at a moderate level.

Under high dimensionality, classical tuning methods such as AIC and BIC, as well as resampling-based methods such as cross-validation, tend to choose models with too many spurious variables; tuning a model selection procedure is therefore even more challenging. Chapter 3 shows that although tuning via the confidence region method requires n > p, it can be combined with a sure-screening method for the ultra-high dimensional case, p >> n, and thus provides a powerful and intuitive approach to variable selection...
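The abstract notes that the confidence region based tuning parameter can be computed from the output of existing penalized regression packages. The following is a minimal sketch of one way such a rule could look in the n > p case, assuming (this is an illustrative assumption, not necessarily the dissertation's exact construction) that the confidence region is the classical F-based ellipsoid centered at the full OLS estimate, and that the candidate estimators come from a LASSO solution path computed with scikit-learn's lasso_path. The data, sample sizes, and 95% level are placeholders.

```python
# Hypothetical sketch: confidence-region tuning of a LASSO path (n > p).
# Assumes the classical F-based confidence region around the full OLS fit;
# the dissertation's actual region and procedure may differ.
import numpy as np
from scipy.stats import f
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 2.0] + [0.0] * 5)
y = X @ beta_true + rng.standard_normal(n)

# Center the data so no intercept is needed (lasso_path fits no intercept).
X = X - X.mean(axis=0)
y = y - y.mean()

# Full OLS fit defines the center and scale of the confidence region.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
rss_ols = np.sum((y - X @ beta_ols) ** 2)
sigma2_hat = rss_ols / (n - p)

# 95% F-based confidence region for the coefficient vector:
# {beta : RSS(beta) <= RSS_ols + p * sigma2_hat * F_{p, n-p, 0.95}}
level = 0.95
rss_bound = rss_ols + p * sigma2_hat * f.ppf(level, p, n - p)

# LASSO solution path from an existing package (alphas are decreasing).
alphas, coefs, _ = lasso_path(X, y, n_alphas=100)

# Pick the sparsest path estimate whose RSS stays inside the region.
best = None
for j in range(len(alphas)):
    b = coefs[:, j]
    rss = np.sum((y - X @ b) ** 2)
    if rss <= rss_bound:
        k = np.count_nonzero(b)
        if best is None or k < best[0]:
            best = (k, alphas[j], b)

# The smallest alpha yields a fit close to OLS, so some path point
# lies inside the region and `best` is set.
k, lam, b_sel = best
print(f"selected lambda = {lam:.4f}, nonzero coefficients = {k}")
```

The design choice illustrated here is that tuning reduces to a single threshold on the residual sum of squares along the path, so only the fitted path and standard F quantiles are needed, consistent with the abstract's claim that the tuning parameter is easily calculated from existing package output.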
Keywords/Search Tags: Variable selection, Confidence region, Penalized regression, Regression methods, Tuning parameter, Model selection