The Parameter Estimation And Variable Selection In High Dimensional Collinearity Models

Posted on: 2015-09-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Dong
Full Text: PDF
GTID: 1220330467985986
Subject: Financial Mathematics and Actuarial
Abstract/Summary:
Variable selection plays an important role in statistical research, especially for high dimensional data. Researchers often include as many predictor variables as possible in a model to improve prediction accuracy, but in practice only a few variables have a substantial influence on the response. It is therefore important to choose a sparse model for future prediction, one that includes only the significant variables. This can be achieved by adding a penalty function to the objective function, which shrinks the coefficients of inactive variables to zero while shrinking or retaining the coefficients of active variables. Variable selection and parameter estimation can then be carried out simultaneously, which can also significantly improve computational efficiency. High correlation among variables in a high dimensional data set can cause serious collinearity, and resolving this issue is the main focus of this study.

This dissertation presents our results on parameter estimation and variable selection in high dimensional collinearity models. The main work is organized as follows. Chapter 2 discusses the joint mean and variance model with combined penalization. In regression models, the effectiveness of the mean parameter estimation depends on the variance parameter estimation. Under certain conditions, the consistency and asymptotic normality of the estimators in the proposed model are proved. Two penalization techniques are combined to address collinearity in high dimensions, because such a structure can contribute to prediction accuracy. Chapter 3 presents the methodology of variable selection and parameter estimation in the generalized linear model with combined penalization and a diverging number of parameters.
The asymptotic properties of the estimator for the model are established. Simulation studies and real data analysis demonstrate that combined penalization performs well when the predictors are highly correlated. Chapter 4 discusses variable selection in the ultra-high dimensional case under the generalized linear model. The theoretical properties of the estimators are obtained using a combination of SCAD and Ridge penalties. Moreover, under some mild conditions, this methodology can be used to establish consistency for the true model. Finally, in Chapter 5, a novel Mixed-GLM model is proposed. The model is accurate, and it can also be used to investigate the features of individuals within the mixed model. The consistency and asymptotic normality of the M-estimator for the Mixed-GLM are proved. Several numerical studies are conducted to evaluate the finite sample performance of the proposed model, which performs well relative to its competitors.
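
As an illustration of the combined penalization idea, the following is a minimal sketch of a SCAD-plus-Ridge penalty term. It assumes the standard SCAD form with shape parameter a = 3.7 and a plain squared-coefficient Ridge term. The function names, the tuning parameters `lam_scad` and `lam_ridge`, and the exact scaling are assumptions made for illustration and are not taken from the dissertation.

```python
def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient (assumed form, a > 2)."""
    b = abs(beta)
    if b <= lam:
        # Small coefficients: L1-like penalty, which drives them to zero.
        return lam * b
    if b <= a * lam:
        # Transition zone: quadratic piece that smoothly reduces the penalty rate.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def combined_penalty(coefs, lam_scad, lam_ridge, a=3.7):
    """Hypothetical combined penalty: SCAD for sparsity plus Ridge for collinearity."""
    scad = sum(scad_penalty(b, lam_scad, a) for b in coefs)
    ridge = lam_ridge * sum(b * b for b in coefs)
    return scad + ridge
```

In a penalized estimation procedure, a term of this kind would be added to the negative log-likelihood (or another loss) before minimization. The SCAD part performs the selection, and the Ridge part stabilizes the estimates when predictors are highly correlated.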
Keywords/Search Tags: Variable selection, High dimensional data, The generalized linear model (GLM), Joint mean and variance model, Mixed-GLM, Combined-penalization