
High-dimensional profile likelihood inference and covariance matrices estimation

Posted on: 2009-03-18
Degree: Ph.D.
Type: Thesis
University: Princeton University
Candidate: Lam, Wai-Fung
Full Text: PDF
GTID: 2440390002998726
Subject: Statistics
Abstract/Summary:
The rapid surge of massive amounts of data in various scientific disciplines, such as DNA microarrays in bioinformatics, high-frequency tick-by-tick financial data, and world wide web text data for classification, has carried contemporary statistics into a new era. A main challenge for statisticians is the analysis of data with tens of thousands of dimensions when the sample size is only on the order of tens or hundreds. Classical statistical methodologies certainly fail to apply.

This thesis explores the treatment of high-dimensional data in two important problems. The first is a semiparametric regression problem: estimating both the parametric and nonparametric components of a generalized varying coefficient partially linear model (GVCPLM), in which some regression coefficients are unknown functions of an index variable while the others are constants, when the number of parameters grows with the sample size. Profile likelihood ratio inference for the growing number of parameters is proposed and the Wilks phenomenon is demonstrated. A new algorithm for computing the profile-kernel estimator, called the accelerated profile-kernel algorithm, is proposed and investigated. Simulation studies show that the resulting estimates are as efficient as the fully iterative profile-kernel estimates. For moderate sample sizes, the proposed procedure saves much computational time over the fully iterative one and gives more stable estimates. A set of real data is also analyzed using the proposed algorithm.

The second problem is the estimation of covariance matrices whose size is comparable to or even larger than the sample size. This occurs when the number of variables is large compared to the sample size, which is typical of high-dimensional data. In this case, regularization is needed to obtain more accurate estimates. Depending on the application, sparsity may arise a priori in the covariance matrix, in its inverse, or in its Cholesky decomposition. We study these three sparsity exploration problems in a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order (s_n log p_n / n)^{1/2}, where s_n is the number of nonsparse elements, p_n is the size of the covariance matrix, and n is the sample size. This explicitly spells out that the contribution of high dimensionality is merely a logarithmic factor. The biases of the estimators using different penalty functions are obtained explicitly. As a result, for the L_1 penalty, to attain sparsistency and the optimal rate of convergence, the nonsparsity rates must be low: s'_n = O(p_n^{1/2}) among O(p_n^2) parameters when estimating a sparse covariance matrix, sparse precision matrix, or sparse Cholesky factor, and s'_n = O(1) when estimating a sparse correlation matrix or its inverse, where s'_n is the number of nonsparse elements among the off-diagonal entries. With the SCAD or hard-thresholding penalty functions, on the other hand, there is no such restriction.
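For concreteness, the precision matrix case of this unified framework can be sketched as a penalized likelihood problem. The display below is a hedged reconstruction assuming a Gaussian likelihood, with S_n denoting the sample covariance matrix; the thesis's exact normalization and penalty placement may differ.

```latex
% Sketch: penalized Gaussian likelihood for a sparse precision matrix
% \Omega = \Sigma^{-1} (the covariance and Cholesky-factor problems are
% treated analogously). S_n is the sample covariance matrix and
% p_{\lambda_n} a general penalty (L_1, SCAD, hard-thresholding, ...).
\hat{\Omega} \;=\; \operatorname*{arg\,min}_{\Omega \succ 0}
  \Bigl\{ \operatorname{tr}(S_n \Omega) \;-\; \log\det(\Omega)
    \;+\; \sum_{i \neq j} p_{\lambda_n}\!\bigl(\lvert \omega_{ij} \rvert\bigr) \Bigr\},
\qquad
\bigl\| \hat{\Omega} - \Omega_0 \bigr\|_F
  \;=\; O_P\!\Bigl( \sqrt{\tfrac{s_n \log p_n}{n}} \Bigr).
```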
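As a toy illustration of this kind of regularization (not the thesis's penalized-likelihood estimator), entrywise soft-thresholding of the sample covariance corresponds to an L_1-type penalty applied separately to each off-diagonal entry. The function name and the tridiagonal example below are invented for illustration; the tuning parameter follows the (log p / n)^{1/2} scaling suggested by the rate above.

```python
import numpy as np

def soft_threshold_covariance(X, lam):
    """Entrywise soft-thresholding of the sample covariance matrix.

    An L_1-style regularizer for sparse covariance estimation: every
    off-diagonal entry is shrunk toward zero by `lam` (small entries
    become exactly zero), while the diagonal is left untouched.
    """
    p = X.shape[1]
    S = np.cov(X, rowvar=False)                   # p x p sample covariance
    off = ~np.eye(p, dtype=bool)                  # mask of off-diagonal entries
    S_hat = S.copy()
    S_hat[off] = np.sign(S[off]) * np.maximum(np.abs(S[off]) - lam, 0.0)
    return S_hat

# Toy example with p comparable to n and a tridiagonal (hence sparse) truth.
rng = np.random.default_rng(0)
p, n = 50, 40
Sigma0 = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), Sigma0, size=n)
S_hat = soft_threshold_covariance(X, lam=np.sqrt(np.log(p) / n))
```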
Finally, we specialize in inverse covariance matrix estimation for data with a natural ordering or equipped with a "distance" metric, for example longitudinal or spatial data. We form blocks of parameters based on each off-diagonal of the Cholesky factor in the modified Cholesky decomposition, and penalize each block with the L_2 norm instead of penalizing individual elements. We develop a one-step estimator and prove an oracle property consisting of a notion of block sign-consistency and asymptotic normality for the one-step estimator. We also prove an operator norm convergence result with an explicitly stated rate, showing that the cost of dimensionality is just log p_n. The advantage of this method over banding (Bickel and Levina, 2008) or the nested LASSO (Levina et al., 2007) is that it allows elimination of weaker signals between stronger ones in the Cholesky factor. A method for obtaining an initial estimator of the Cholesky factor is discussed, and a gradient projection algorithm is developed for computing the one-step estimate. Simulation studies suggest that the method works well and can outperform competing ones. A set of real data is analyzed using the new procedure and compared with banding.
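To make the block structure concrete, the sketch below computes the modified Cholesky decomposition of a covariance matrix and the L_2 norm of each off-diagonal (subdiagonal) of the Cholesky factor, i.e. the parameter groups the block penalty acts on. It is an illustrative computation only, not the one-step estimator or the gradient projection algorithm; the function names are invented.

```python
import numpy as np

def modified_cholesky(S):
    """Modified Cholesky decomposition S^{-1} = T' D^{-1} T.

    T is unit lower triangular; row j holds the negated coefficients
    from regressing variable j on its predecessors 1..j-1 (a natural
    ordering is assumed), and d holds the prediction error variances.
    """
    p = S.shape[0]
    T = np.eye(p)
    d = np.empty(p)
    d[0] = S[0, 0]
    for j in range(1, p):
        phi = np.linalg.solve(S[:j, :j], S[:j, j])  # regression coefficients
        T[j, :j] = -phi
        d[j] = S[j, j] - S[:j, j] @ phi             # innovation variance
    return T, d

def subdiagonal_norms(T):
    """L_2 norm of each off-diagonal block of the Cholesky factor T.

    Penalizing whole subdiagonals lets weak lags sitting between strong
    ones be zeroed out, which a contiguous band cannot achieve.
    """
    p = T.shape[0]
    return np.array([np.linalg.norm(np.diag(T, k=-w)) for w in range(1, p)])

# Usage on a sample covariance with a natural (e.g. time) ordering.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
S = np.cov(X, rowvar=False)
T, d = modified_cholesky(S)
# Sanity check of the decomposition: S^{-1} == T' D^{-1} T (up to rounding).
assert np.allclose(np.linalg.inv(S), T.T @ np.diag(1.0 / d) @ T)
print(subdiagonal_norms(T))
```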
Keywords/Search Tags: Data, Covariance, Sample size, Using, High-dimensional, Cholesky factor