Statistical learning for high dimensional data set with group structure

Posted on: 2017-10-29
Degree: Ph.D.
Type: Dissertation
University: The University of Wisconsin - Madison
Candidate: Xiong, Lie
Full Text: PDF
GTID: 1458390008952864
Subject: Statistics
Abstract/Summary:
This dissertation studies statistical learning for high-dimensional datasets with group structure. It consists of three parts. In the first part, we consider the variable selection problem in datasets with missing values and grouped covariates. After multiple imputation is applied to handle the missing values, we treat the coefficients of a covariate across all imputed datasets as a group, and propose a multiple imputation group LASSO method that consistently selects important groups and covariates across all imputed datasets. In the second part, we propose a multilevel hierarchical penalized regression method for the case where a hierarchical group structure exists among the covariates. Our method removes unimportant groups effectively while retaining the flexibility of subgroup selection within the identified groups and of variable selection within the identified subgroups. The new method has the potential to achieve the theoretical "oracle" property. In the third part, we consider group structure under the multivariate regression framework. By treating the signals of a covariate across all responses as a group, we propose a multivariate component-wise boosting method that handles high dimensionality and possible nonlinear associations in the high-dimension-low-sample-size setting.
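The grouping idea in the first part can be illustrated with a minimal sketch. The abstract does not give the estimator, so everything below is an assumption: the coefficient matrix `B` (one row per covariate, one column per imputed dataset), the penalty level `lam`, and the helper name `group_soft_threshold` are hypothetical. The sketch uses the standard group LASSO proximal step (row-wise block soft-thresholding), which may differ from the dissertation's actual procedure. It shows why grouping gives consistent selection: a covariate is kept or dropped in all imputed datasets at once.

```python
import numpy as np

def group_soft_threshold(B, lam):
    """Row-wise block soft-thresholding.

    This is the proximal operator of lam * sum_j ||B[j]||_2, the usual
    group LASSO penalty. Each row B[j] holds the coefficients of
    covariate j across all imputed datasets (an illustrative layout,
    not taken from the dissertation).
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return B * scale

# Toy example: 3 covariates, 2 imputed datasets (made-up numbers).
B = np.array([[3.0, 4.0],    # strong signal in both datasets
              [0.1, -0.1],   # weak signal, removed as a whole group
              [0.0, 0.0]])   # already zero
shrunk = group_soft_threshold(B, lam=1.0)

# A covariate is selected only if its whole row survives the shrinkage.
selected = np.flatnonzero(np.linalg.norm(shrunk, axis=1) > 0)
```

Because the threshold acts on the Euclidean norm of a whole row, a covariate cannot be selected in one imputed dataset and dropped in another, which is the consistency the first part aims for.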
Keywords/Search Tags: Structure, Method