
Regression with smoothly clipped absolute deviation penalty

Posted on: 2008-08-27
Degree: Ph.D.
Type: Dissertation
University: The University of Iowa
Candidate: Xie, Huiliang
Full Text: PDF
GTID: 1440390005969035
Subject: Statistics
Abstract/Summary:
In linear regression, variable selection becomes important when a large number of predictors are present, both for interpretation and for prediction. Penalized regression addresses this by performing variable selection and coefficient estimation simultaneously.

We investigate regression with the smoothly clipped absolute deviation (SCAD) penalty. In particular, we study the asymptotic properties of the SCAD-penalized estimator. In our study, the number of predictors pn is allowed to go to infinity as the number of observations n goes to infinity. Like sample size computation in power analysis, this sheds light on the quality of the method. Our study improves on previous results on SCAD-penalized regression in that we no longer restrict the search for the SCAD-penalized estimator to a neighborhood of the true coefficients, yet still obtain the oracle property of the estimator.

We also extend SCAD-penalized regression to the partially linear model, with a view to obtaining a more interpretable and sparse model in the linear part. Under reasonable assumptions, the estimator of the linear coefficients is shown to be consistent in variable selection and asymptotically normal for the nonzero coefficients. At the same time, the estimator of the nonparametric part is globally consistent and attains the optimal convergence rate of the purely nonparametric setting. Simulation is used to show the finite-sample behavior of the estimator.

As another extension, the method is applied to the accelerated failure time model. Under certain censoring assumptions, the oracle property continues to hold for the estimator defined as the minimizer of the SCAD-penalized Kaplan-Meier weighted least squares criterion. This extension justifies the use of penalized regression for variable selection and coefficient estimation when the responses are subject to right censoring.

Existing algorithms are adapted to compute the SCAD-penalized least squares estimator in these cases.
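The dissertation itself contains no code; as a reference point for readers, the following is a minimal sketch of the SCAD penalty as introduced by Fan and Li (2001), which is linear (lasso-like) near zero, quadratically tapering in a middle zone, and constant beyond a*lambda so that large coefficients are not shrunk. The function names and the default a = 3.7 follow common convention, not this dissertation.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty (Fan & Li, 2001), applied elementwise to |theta|.

    Piecewise: lam*|t| for |t| <= lam; a quadratic taper for
    lam < |t| <= a*lam; constant (a+1)*lam^2/2 for |t| > a*lam.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    return np.where(
        t <= lam,
        lam * t,
        np.where(
            t <= a * lam,
            (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
            (a + 1) * lam**2 / 2,
        ),
    )

def scad_derivative(theta, lam, a=3.7):
    """Derivative p'_lam(|theta|): equals lam near zero and vanishes
    beyond a*lam, which is what yields near-unbiasedness for large
    coefficients (in contrast to the lasso's constant shrinkage)."""
    t = np.abs(np.asarray(theta, dtype=float))
    return lam * (
        (t <= lam).astype(float)
        + np.maximum(a * lam - t, 0.0) / ((a - 1) * lam) * (t > lam)
    )
```

The flat tail of the penalty (zero derivative past a*lambda) is the mechanism behind the oracle property discussed above: sufficiently large coefficients incur no asymptotic bias.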
Large sample theory, matrix inequalities and spline theory are employed to establish the variable selection consistency and asymptotic normality of the SCAD-penalized least squares estimator in these settings. These estimators are also illustrated with real data examples. Finite-sample behavior of the estimators is studied via simulation and compared with other widely used approaches.
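For the censored-response extension, the weighted least squares criterion puts Kaplan-Meier weights (the jumps of the Kaplan-Meier estimator of the response distribution) on the observed, uncensored responses. A minimal sketch of those weights follows, assuming no ties for simplicity; `km_weights` is a hypothetical helper name, not a function from this dissertation.

```python
import numpy as np

def km_weights(times, delta):
    """Kaplan-Meier weights for weighted least squares under right
    censoring. delta[i] = 1 if times[i] is an observed (uncensored)
    response, 0 if censored. Assumes no ties (a sketch only).

    The weight on the i-th smallest observation is the jump of the
    Kaplan-Meier estimator at that point; censored observations get
    weight zero, and their mass is redistributed to later points.
    """
    times = np.asarray(times, dtype=float)
    delta = np.asarray(delta, dtype=float)
    n = len(times)
    order = np.argsort(times)
    d = delta[order]
    w = np.zeros(n)
    surv = 1.0  # running Kaplan-Meier survival just before current time
    for i in range(n):
        w[order[i]] = surv * d[i] / (n - i)  # KM jump at this point
        surv *= 1.0 - d[i] / (n - i)
    return w
```

With no censoring every observation receives weight 1/n, recovering ordinary least squares; with censoring, the weights of censored points are shifted onto larger observed responses.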
Keywords/Search Tags: Regression, Variable selection, Estimator, Linear, Sample