Controlling variable selection by the addition of pseudo-variables

Posted on: 2005-05-20
Degree: Ph.D.
Type: Dissertation
University: North Carolina State University
Candidate: Wu, Yujun
Full Text: PDF
GTID: 1450390008977425
Subject: Statistics
Abstract/Summary:
Many variable selection procedures have been developed in the literature for linear regression models. We propose a new and general approach, the False Selection Rate (FSR) method, for controlling variable selection; its advantage is that it applies to a broader class of regression models, such as binary regression and Poisson regression. By adding a number of pseudo-variables to the real data set and monitoring the proportion of pseudo-variables falsely selected into the model, we are able to control the model's false selection rate, selecting as many important variables as possible while keeping the proportion of falsely selected unimportant variables relatively low. We focus on forward selection because it remains applicable when there are more variables than observations. Because analytical results are difficult to obtain, we study the approach by Monte Carlo simulation and compare it with a variety of commonly used procedures. We first focus on linear regression models and then extend the approach to logistic regression models. The new method is illustrated on four real data sets.
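The idea of appending pseudo-variables and watching how many of them enter the model can be illustrated with a minimal Python sketch. It is not the dissertation's FSR procedure: the abstract does not say how the pseudo-variables are generated or how the rate is estimated, so the sketch assumes pure-noise Gaussian pseudo-variables and a plain least-squares forward selection. The function names `forward_selection_order`, `pseudo_fraction_path`, and `largest_step_within_rate`, and the threshold `gamma`, are hypothetical.

```python
import numpy as np


def forward_selection_order(X, y, max_steps):
    """Greedy least-squares forward selection; returns column indices in entry order."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    for _ in range(min(max_steps, p)):
        best_j, best_rss = None, np.inf
        for j in remaining:
            # Fit intercept + already-selected columns + candidate column j.
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            rss = float(resid @ resid)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


def pseudo_fraction_path(X, y, n_pseudo, max_steps, rng):
    """Append pseudo-variables, run forward selection, and track the running
    fraction of selected variables that are pseudo-variables."""
    n, p = X.shape
    # Assumption: pseudo-variables are independent standard normal noise.
    Z = rng.standard_normal((n, n_pseudo))
    order = forward_selection_order(np.hstack([X, Z]), y, max_steps)
    is_pseudo = np.array(order) >= p  # columns p, p+1, ... are pseudo
    fractions = np.cumsum(is_pseudo) / np.arange(1, len(order) + 1)
    return order, fractions


def largest_step_within_rate(fractions, gamma):
    """Hypothetical stopping rule: the largest step whose pseudo fraction is <= gamma."""
    ok = np.nonzero(np.asarray(fractions) <= gamma)[0]
    return int(ok[-1]) + 1 if ok.size else 0


# Simulated example: 6 real variables, of which the first 3 drive the response.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
y = 5.0 * X[:, :3].sum(axis=1) + rng.standard_normal(200)

order, fractions = pseudo_fraction_path(X, y, n_pseudo=6, max_steps=8, rng=rng)
k = largest_step_within_rate(fractions, gamma=0.2)
real_selected = [j for j in order[:k] if j < X.shape[1]]
print("entry order:", order)
print("pseudo fraction by step:", np.round(fractions, 2))
print("real variables kept:", real_selected)
```

The running pseudo fraction serves here only as a rough proxy for how often unimportant variables are entering the model; the stopping rule based on `gamma` is a simplification chosen for illustration, and the pseudo-variables themselves are dropped from the final selection.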
Keywords/Search Tags: Variable selection, Regression models