Robust Model Selection And Model Averaging For Linear Models

Posted on: 2022-01-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y F Guo
GTID: 1480306728485264
Subject: Statistics
Abstract/Summary:
Model selection is an important research direction in statistics, with applications in econometrics, finance and other fields. Model averaging, also called prediction combination, has been proposed to avoid the uncertainty of the model selection process and the risk of choosing a poor model. Because a model average estimate is a weighted average of the estimates from all candidate models, it tends to be more robust than an estimate from a single selected model.

Ordinary Least Squares (OLS) is a commonly used estimation method. Although OLS has the smallest variance among all linear unbiased estimators, it often runs into the following problems: (1) there are outliers in the direction of the dependent variable y, or (and) the random error terms violate the normality assumption; (2) there is multicollinearity among the independent variables (abbreviated as collinearity); (3) outliers and collinearity are both present. These three problems may arise in many real data sets, especially in big data and complex data. The commonly used model selection and model averaging methods based on least squares estimation are affected by the same problems, so robust model selection and model averaging methods need further study. This dissertation focuses on robust model selection and model averaging methods under problems (1), (2) and (3). The specific research results are as follows:

1) For problem (1), that is, outliers in the dependent variable y, this dissertation proposes SMA (Sp Model Averaging), a model averaging method based on the Sp criterion that is robust to outliers. SMA combines classic model averaging with robust model selection: it accounts for the influence of outliers on model selection and, unlike a robust model selection method alone, also accounts for the uncertainty in the model selection process. Extensive simulation studies show the advantages of the proposed method over some common methods. When outliers are present, SMA is consistently superior to some common model selection and model averaging methods with respect to the MSE criterion. Even when the data contain no outliers, SMA remains very close to the best method in the comparison. Finally, an analysis of the Stack loss data further verifies the practicability and effectiveness of the proposed method.

2) For the multicollinearity in problem (2), this dissertation extends the Rp model selection method, which is based on ridge estimation, to model averaging, and proposes a new model averaging method, RMA (Rp Model Averaging). RMA still gives robust statistical predictions in the presence of multicollinearity, and it is better than some common model selection and model averaging methods in the sense of mean square error. The finite sample properties of the proposed and the commonly used model selection and model averaging methods are explored through Monte Carlo simulation. When collinearity exists, RMA is significantly better than the common model selection and model averaging methods, and also better than the SMA method proposed in this dissertation, especially when the sample size is small and the variance is relatively large. Without collinearity, the performance of RMA is almost the same as that of the best method in the comparison, which confirms the superiority of the proposed method. Finally, the feasibility of RMA is verified by analyzing the Hald cement data.

3) For problem (3), that is, the coexistence of outliers and multicollinearity, this dissertation proposes RMp, a robust model selection criterion based on the robust ridge M estimation method, and then proposes a robust model averaging method, RMMA (RMp Model Averaging), based on the RMp criterion. Simulation studies show that the proposed method is effective compared with other commonly used model averaging and model selection methods. Specifically, when outliers and collinearity coexist, RMMA performs best almost uniformly; when there are only outliers, RMMA is almost the same as the best of SMA and Sp; when there is only collinearity, the difference between RMMA and the best of RMA and Rp is very small; when there are neither outliers nor collinearity, RMMA is very close to the best of MMA and Cp. An analysis of the Tobacco data further illustrates the practicality of RMMA.

The innovations of this dissertation are:

(1) A model averaging method, SMA, that is robust to outliers is proposed. It alleviates, to some extent, the influence of outliers on the model average estimate, and its effectiveness is verified through numerical simulation and real data analysis.

(2) For data with multicollinearity, a model averaging method, RMA, that is robust to collinearity is developed, which effectively resists the collinearity problem in model averaging. Simulation studies and real data analysis show that RMA is superior to some common methods when collinearity exists.

(3) For the coexistence of outliers and collinearity in the data, a new robust model selection method, RMp, is proposed, and the corresponding robust model averaging method, RMMA, is derived from it. Monte Carlo simulation and real data analysis show that RMMA can serve as a safeguard for model averaging.
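
The abstract describes a model average estimate as a weighted average of the estimates from all candidate models, and RMA as building on ridge estimation. The following is a minimal sketch of that general idea only, not the dissertation's method. The simulated data, the nested candidate sets, the fixed weights and the ridge parameter k are all assumptions made for illustration, and the Sp, Rp and RMp criteria that would actually choose the weights are not reproduced here.

```python
import numpy as np

def ridge_fit(X, y, k=0.0):
    """Ridge estimate (X'X + kI)^{-1} X'y; k = 0 reduces to OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def model_average(X, y, candidates, weights, k=0.0):
    """Weighted average of candidate-model coefficient estimates.

    candidates: list of column-index lists, one per candidate model.
    weights: nonnegative weights summing to 1 (taken as given here;
             a criterion-based method would choose them from the data).
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    beta_avg = np.zeros(X.shape[1])
    for cols, w in zip(candidates, weights):
        beta_m = np.zeros(X.shape[1])      # excluded regressors stay at 0
        beta_m[cols] = ridge_fit(X[:, cols], y, k)
        beta_avg += w * beta_m
    return beta_avg

# Simulated data (hypothetical): x2 is nearly collinear with x1.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 1.0 * x2 + 0.5 * x3 + rng.normal(size=n)

candidates = [[0], [0, 1], [0, 1, 2]]   # nested candidate models
weights = [0.2, 0.3, 0.5]               # illustrative, not data-driven

beta_ols_avg = model_average(X, y, candidates, weights, k=0.0)
beta_ridge_avg = model_average(X, y, candidates, weights, k=1.0)
print("OLS-based average:  ", beta_ols_avg)
print("Ridge-based average:", beta_ridge_avg)
```

With k = 0 each candidate is fitted by OLS, so the two calls differ only in whether the candidate estimates are shrunk by the ridge penalty before averaging. A robust variant in the spirit of SMA or RMMA would replace the least squares fit with an M-type estimate, which this sketch does not attempt.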
Keywords/Search Tags:Outliers, Multi-collinearity, Model selection, Model averaging, Ridge estimator, M estimator, Ridge M estimator