Extending The Scope Of Sufficient Dimension Reduction Theory And Its Related Methods

Posted on: 2011-07-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Yu
Full Text: PDF
GTID: 1100360305498958
Subject: Probability theory and mathematical statistics
Abstract/Summary:
This dissertation is devoted to theoretical extensions and methodology development in the literature of sufficient dimension reduction.

There are two main focuses in the sufficient dimension reduction literature. The first topic is estimating the basis directions of the Central (Mean) Subspace. However, the asymptotic properties of the directions estimated by the classical methods are not well understood. To better understand the theoretical properties of existing sufficient dimension reduction methods, especially of their estimated directions, we first derive second-order asymptotic expansions for the sample estimators of their candidate matrices and basis directions. We take the four most commonly used sufficient dimension reduction methods for illustration: Sliced Inverse Regression (SIR; Li, 1991), Sliced Average Variance Estimation (SAVE; Cook and Weisberg, 1991), Principal Hessian Directions (PHD; Li, 1992) and Directional Regression (DR; Li and Wang, 2007). As an application of the asymptotic results, the following task is to reduce the bias, the specific aim being removal of the leading bias of order O(n^{-1}). With the help of the second-order asymptotic expansions of these methods, we can easily deduce general formulae for their second-order biases. Moreover, the second-order bias of each method can be consistently estimated by the obvious sample analogue. Subtracting the sample estimators of the second-order biases from the original estimators, we construct bias-corrected estimators for these sufficient dimension reduction methods, which are unbiased to order O(n^{-1}).

How to determine the structural dimension is another critical issue. Each classical method for determining the structural dimension has its own shortcoming: sequential tests rely heavily on the choice of significance level, and the bootstrap method is computationally intensive. Even though the Bayes information criterion proposed by Zhu, Miao and Peng (2006) can consistently estimate the structural dimension, the optimal form of its penalty function is difficult to identify in a data-driven manner. Moreover, traditional sufficient dimension reduction methods separate direction estimation and structural dimension determination into two steps. Our second contribution is a sparse eigen-decomposition strategy that shrinks small sample eigenvalues to zero. Its main idea is to formulate the spectral decomposition of a kernel matrix as a least squares problem so that a penalty can be imposed on the sample eigenvalues. The adaptive Least Absolute Shrinkage and Selection Operator (Zou, 2006) is recommended to produce sparse estimates of the eigenvalues, so that the structural dimension can be estimated efficiently. Unlike existing methods, the new method simultaneously estimates the basis directions and the structural dimension of the Central (Mean) Subspace in a data-driven manner. The oracle property of our estimation procedure is also established.

The third part of this thesis is a B-spline approximation to the kernel matrices of the classical sufficient dimension reduction methods SIR and SAVE. Compared with the slicing and kernel estimation methods used in the literature, the B-spline approximation is more accurate and easier to implement. In addition, we suggest a modified Bayes information criterion, based on Zhu, Miao and Peng (2006), to estimate the structural dimension. The key idea in choosing the penalty function is to make the leading term and the penalty term comparable in magnitude. This modified criterion helps to enhance the efficacy of estimation.
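As a concrete reference point for the methods above, the following is a minimal sketch of the slicing estimator behind SIR: it builds the candidate (kernel) matrix whose eigen-decomposition the bias corrections and the sparse eigen-decomposition strategy operate on. This is an illustrative reimplementation assuming NumPy, not the dissertation's own code; the function name sir_directions and its defaults are ours.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=1):
    """Estimate d basis directions of the Central Subspace via SIR (Li, 1991).

    X: (n, p) predictors; y: (n,) response; n_slices: slices of the ordered
    response; d: assumed structural dimension. Assumes Cov(X) is positive
    definite (a sketch-level assumption, not checked here).
    """
    n, p = X.shape
    # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}.
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    w, V = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    Z = (X - mu) @ Sigma_inv_sqrt

    # Slice by the ordered response; average Z within each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))  # candidate (kernel) matrix: Cov(E[Z | Y])
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Top-d eigenvectors of M span the standardized Central Subspace;
    # map them back to the original scale of X.
    vals, vecs = np.linalg.eigh(M)
    B = Sigma_inv_sqrt @ vecs[:, ::-1][:, :d]
    return B / np.linalg.norm(B, axis=0), vals[::-1]
```

The sample eigenvalues returned here are exactly the quantities that the sparse eigen-decomposition strategy would shrink toward zero with an adaptive LASSO penalty, so that the count of nonzero eigenvalues estimates the structural dimension.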
Dimension reduction for semiparametric regressions involves two issues: recovering informative linear combinations of the predictors and selecting the contributing predictors. The first goal can be achieved by sufficient dimension reduction; the second is a currently hot topic in statistics, variable selection. The fourth part of this thesis develops a new method that accomplishes sufficient dimension reduction and variable selection simultaneously. Inspired by the groundbreaking work of Candes and Tao (2007), we suggest an ℓ1-regularization of sliced inverse regression via the Dantzig selector. The new proposal is designed to strike a balance between nearly solving the eigen-decomposition form of SIR and minimizing the ℓ1 norm of the directions. Moreover, the new method works efficiently even when p > n, where p is the dimension of the predictors and n is the sample size. When the dimension p is fixed, the asymptotic properties (consistency in estimation and convergence in distribution) of the proposed estimators are investigated. Moreover, a bound on the estimation error is derived when both the sample size n and the dimension p tend to infinity, which covers p > n as a special case.

Finally, we study how to test for contributing predictors based on directional regression in a model-free setting. We derive the asymptotic distributions of the proposed test statistics under the null hypothesis. Based on these asymptotics, we further propose two easy-to-implement variable selection procedures. The new predictor selection proposals differ from popular variable selection methods such as the Least Absolute Shrinkage and Selection Operator (LASSO; Tibshirani, 1996) in that they are model free and involve no penalties. Under certain conditions, the two new approaches correctly identify the significant variables with probability tending to 1. Moreover, we compare each proposal in this dissertation with its related alternatives through comprehensive simulation studies to illustrate the efficiency of our proposals. We also demonstrate the use of our proposals in a wide range of real data applications, including the hitters' salary data, the horse mussel data, the lymphoma data and the Boston housing data.
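For readers unfamiliar with the Dantzig selector, the following linear programming sketch shows the generic estimator of Candes and Tao (2007) that inspires the ℓ1-regularized SIR: minimize the ℓ1 norm subject to a sup-norm bound on the correlation of the residual with the design. It is a hedged illustration assuming SciPy; it does not reproduce the dissertation's SIR-specific constraint matrix built from the eigen-decomposition form, and the function name dantzig_selector is ours.

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, delta):
    """Solve: min ||beta||_1  s.t.  ||X.T @ (y - X @ beta)||_inf <= delta."""
    n, p = X.shape
    A, b = X.T @ X, X.T @ y
    # Split beta = u - v with u, v >= 0, so ||beta||_1 = sum(u + v) is linear.
    c = np.ones(2 * p)
    # |A @ beta - b| <= delta becomes two stacked inequality blocks:
    #   A(u - v) <= delta + b   and   -A(u - v) <= delta - b.
    G = np.vstack([np.hstack([A, -A]), np.hstack([-A, A])])
    h = np.concatenate([delta + b, delta - b])
    res = linprog(c, A_ub=G, b_ub=h, bounds=[(0, None)] * (2 * p))
    uv = res.x
    return uv[:p] - uv[p:]
```

Because the problem is a linear program in 2p nonnegative variables, it remains solvable when p > n; in the dissertation's proposal the least-squares normal equations above are replaced by the eigen-decomposition form of SIR, which is what allows simultaneous direction estimation and variable selection.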
Keywords/Search Tags: B-spline, Bias Correction, Eigen-Decomposition, Second Order Asymptotics, Structural Dimension, Sparsity, Sufficient Dimension Reduction, Variable Selection