Large Sample Properties Of Ridge Estimators Of Regression Parameters In Linear Regression Models With Missing Data

Posted on: 2011-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: D Dong
Full Text: PDF
GTID: 2120360305978004
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In practice, data may be missing for various reasons, such as human-caused or other unknown factors; this happens in opinion polls, market research, medical studies and socio-economic studies. Statistical inference with missing data has recently become an important and active research field. When data are missing, the usual inferential procedures for complete data sets cannot be applied directly; the data must be treated first. The common treatments for incomplete data are the complete-case method and the imputation method. The complete-case method first deletes all units with non-response and then treats the remaining data as "complete data" to which the usual inferential procedures are applied. Imputation methods fall into two classes, deterministic imputation and stochastic imputation; each imputes every missing response so that a "complete data" set is obtained and the usual inferential procedures can be applied.

The linear model has a strong background in practical applications and has been widely used to analyze data in many research fields, such as medical science, biology, economics, finance, environmental science and engineering. The least squares estimate (LSE) occupies a central position in the theory of estimation in linear models, but it becomes unsatisfactory when the design matrix is degenerate or nearly degenerate. Researchers therefore introduced the ridge estimate, which is widely used for statistical inference when the design matrix is degenerate or nearly degenerate.

In 1970, Hoerl and Kennard (Ridge regression: biased estimation for nonorthogonal problems[J]. Technometrics, 1970, 12: 55-67.) proposed the ridge estimate $\hat{\beta}(k) = (S + kI)^{-1}X'Y$ to improve on the LSE, where $k > 0$, $S = X'X$, $X$ is the design matrix, $Y$ is the response vector and $I$ is the identity matrix.
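The ridge estimate defined above can be illustrated numerically. The following is a minimal sketch, not taken from the thesis: the function name `ridge_estimate`, the simulated nearly collinear design, and the choice $k = 1$ are all assumptions made for illustration.

```python
import numpy as np

def ridge_estimate(X, Y, k):
    """Return (S + kI)^{-1} X'Y with S = X'X; k = 0 gives the LSE."""
    S = X.T @ X
    p = S.shape[0]
    return np.linalg.solve(S + k * np.eye(p), X.T @ Y)

# Hypothetical data: two nearly collinear columns, so S is nearly degenerate.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)
X = np.column_stack([x1, x2])
beta_true = np.array([1.0, 1.0])
Y = X @ beta_true + rng.normal(size=n)

lse = ridge_estimate(X, Y, 0.0)     # least squares estimate
ridge = ridge_estimate(X, Y, 1.0)   # ridge estimate with k = 1 (arbitrary)

# Adding kI improves the conditioning of the matrix being inverted.
cond_S = np.linalg.cond(X.T @ X)
cond_ridge = np.linalg.cond(X.T @ X + 1.0 * np.eye(2))

print("LSE:  ", lse)
print("ridge:", ridge)
print("condition numbers:", cond_S, cond_ridge)
```

For any $k > 0$ the matrix $S + kI$ is better conditioned than $S$, and the ridge estimate has a Euclidean norm no larger than that of the LSE; this shrinkage is what stabilizes the estimate when the design is nearly degenerate, at the cost of some bias.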
Research on and applications of ridge estimation have attracted extensive attention, and the ridge estimate has become the most influential biased estimate. Early theoretical results on ridge estimation can be found in Hoerl and Kennard (Ridge regression: biased estimation for nonorthogonal problems[J]. Technometrics, 1970, 12: 55-67.) and Farebrother (Further results on the mean square error of ridge regression[J]. J Roy Statist Soc B, 1976, 38: 248-259.). Systematic summaries of ridge estimation can be found in Wang et al. (Theory and Application of the Linear Model[M]. Hefei: Anhui Education Press, 1987; Introduction to Linear Model[M]. Beijing: Higher Education Press, 2004.), which gave a series of sufficient conditions under which the ridge estimate is superior to the LSE. Dai (The conditions that ridge estimation is superior to least square estimation[J]. Mathematical Statistics and Applied Probability, 1994, 9(2): 53-58.) discussed the superiority of the ridge estimate over the LSE in the sense of mean square error and obtained some necessary conditions under which the ridge estimate is superior to the LSE. Wang (The consistency of ridge regression[J]. Mathematical Statistics and Applied Probability, 1987, 3(1): 42-51.) discussed the consistency of ridge regression and some limit properties of estimates of the error variance based on ridge regression, and proved that they share the same properties as the LSE. Many researchers have proposed improved ridge estimates to increase accuracy in terms of mean square error.

In a linear model with equality restrictions, as Zheng (Restricted linear estimations[J]. Chinese Journal of Applied Probability and Statistics, 1986, 2(1): 5-12.) pointed out, the mean square error of $\beta^{*}$, the restricted LSE of $\beta$, may be large under certain conditions, which leads to unsatisfactory results. This motivated the search for a class of reasonable biased estimators of $\beta$ that improve on $\beta^{*}$.
Lei (Convergence of the ridge estimations of the linear model[J]. Journal of Guangxi Normal University, 1999, 10(1): 21-24.) studied strong convergence, weak convergence and mean square convergence of the ridge estimate in a linear model with the restriction $R\beta = 0$, and obtained necessary and sufficient conditions for weak convergence and sufficient conditions for strong convergence. Shi (The conditional ridge-type estimation of regression coefficient in restricted linear regression model[J]. Journal of Shanxi Normal University (Natural Science Edition), 2001, 15(4): 10-16.) proposed a new ridge-type estimator $\beta^{*}(k) = (kW + I)^{-1}\beta^{*}$ in the restricted linear regression model with the homogeneous restriction $R\beta = 0$, proved that this estimator is superior to the restricted LSE of $\beta$ under certain conditions, and showed that it is admissible under the restriction. Nong et al. (The conditional ridge-type estimation of regression coefficient in restricted linear regression model of nonhomogeneous equations[J]. Journal of Sichuan Normal University (Natural Science), 2007, 30(6): 721-725.) proposed a class of ridge-type estimators of the regression coefficients in a restricted linear regression model with the nonhomogeneous restriction $R\beta = r$. They discussed the statistical properties of the estimator and the relationship between the ridge-type estimator and the LSE, and showed that it is superior to the restricted LSE under some regularity conditions and optimality criteria.

In practice, missing data occur frequently, but statistical inference for ridge estimation of the regression coefficients in the restricted linear regression model with missing data has not been studied. In Chapter 2, we consider the restricted linear regression model with fixed designs.
For incomplete data with missing responses, three methods of treating the missing data are considered: deleting all units with missing responses, forming "complete data" by deterministic imputation, and forming "complete data" by random imputation. Based on these methods, we propose three ridge estimates of the regression coefficients, study their strong and weak consistency, and also study the strong and weak consistency and the asymptotic normality of any linear function of these estimates. In Chapter 3, we consider the restricted linear regression model with random designs. For incomplete data with missing responses, the same three methods of treating the missing data are considered: deleting all units with missing responses, forming "complete data" by deterministic imputation, and forming "complete data" by random imputation. Based on these methods, we propose three ridge estimates of the regression coefficients, study their strong and weak consistency, and also study the strong and weak consistency and the asymptotic normality of any linear function of these estimates.

Here we summarize the new findings of this thesis:

1. Under the missing at random (MAR) mechanism, we study the large-sample properties of ridge estimates of the regression coefficients in the restricted linear regression model with fixed designs. For incomplete data with missing responses, three methods of treating the missing data are considered. Based on these methods, we propose three ridge estimates of the regression coefficients, study their strong and weak consistency, and also study the strong and weak consistency and the asymptotic normality of any linear function of these estimates.

2. Under the missing at random (MAR) mechanism, we study the large-sample properties of ridge estimates of the regression coefficients in the restricted linear regression model with random designs.
For incomplete data with missing responses, three methods of treating the missing data are considered. Based on these methods, we propose three ridge estimates of the regression coefficients, study their strong and weak consistency, and also study the strong and weak consistency and the asymptotic normality of any linear function of these estimates.
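The three treatments of missing responses can be sketched in code. This is a rough illustration under stated assumptions, not the thesis's construction: the abstract does not give the imputation rules, so the sketch assumes that deterministic imputation fills each missing response with the fitted value from the complete-case ridge fit, and that random imputation adds a residual resampled from the complete cases. It also ignores the linear restriction on $\beta$ and uses an unrestricted ridge estimate. The names `ridge` and `three_ridge_estimates` and the simulated MAR mechanism are hypothetical.

```python
import numpy as np

def ridge(X, Y, k):
    """Unrestricted ridge estimate (X'X + kI)^{-1} X'Y."""
    S = X.T @ X
    return np.linalg.solve(S + k * np.eye(S.shape[0]), X.T @ Y)

def three_ridge_estimates(X, Y, observed, k, rng):
    """Ridge estimates from three treatments of missing responses.

    observed is a boolean array; Y may hold NaN where observed is False.
    """
    # 1. Complete-case: delete all units with a missing response.
    Xc, Yc = X[observed], Y[observed]
    beta_cc = ridge(Xc, Yc, k)

    # 2. Deterministic imputation (assumed rule): fill with fitted values.
    fitted = X @ beta_cc
    Y_det = np.where(observed, Y, fitted)
    beta_det = ridge(X, Y_det, k)

    # 3. Random imputation (assumed rule): fitted value plus a residual
    #    drawn with replacement from the complete-case residuals.
    resid = Yc - Xc @ beta_cc
    draws = rng.choice(resid, size=len(Y), replace=True)
    Y_rand = np.where(observed, Y, fitted + draws)
    beta_rand = ridge(X, Y_rand, k)

    return beta_cc, beta_det, beta_rand

# Hypothetical simulation: response probability depends only on X (MAR).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -2.0])
Y_full = X @ beta_true + 0.5 * rng.normal(size=n)
p_obs = 1.0 / (1.0 + np.exp(-(0.5 + X[:, 0])))
observed = rng.random(n) < p_obs
Y = np.where(observed, Y_full, np.nan)

beta_cc, beta_det, beta_rand = three_ridge_estimates(X, Y, observed, 1.0, rng)
print(beta_cc, beta_det, beta_rand)
```

With a large sample and a fixed $k$, all three estimates in this simulation land close to the coefficients used to generate the data, which is the kind of behaviour the consistency results concern. Under the assumed deterministic rule with $k = 0$, the imputed-data estimate coincides with the complete-case LSE.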
Keywords/Search Tags: missing data, linear model, ridge estimation, consistency, asymptotic normality, MAR missing mechanism