Simultaneous variable selection and simultaneous subspace selection for multitask learning

Posted on: 2010-09-22
Degree: Ph.D
Type: Thesis
University: University of California, Berkeley
Candidate: Obozinski, Guillaume Romain
GTID: 2448390002976635
Subject: Statistics
Abstract/Summary:
This thesis considers the problem of simultaneous covariate selection and simultaneous subspace selection for a group of learning problems, e.g. regression or classification problems, which are defined over the same covariate space and which are assumed "related" in the sense that a small number of covariates or, respectively, a low-dimensional subspace contains the information relevant to all learning problems. We use an ℓ1/ℓ2 block-regularization scheme that groups the coefficients associated with each covariate across the different classification problems, so that similar sparsity patterns are encouraged in all models. We propose a blockwise path-following scheme that approximately traces the regularization path and that exploits, for computational efficiency, the sparsity of solutions at high regularization levels.

We then show how to use random projections to extend this approach to the problem of joint subspace selection, where multiple predictors are found in a common low-dimensional subspace, and we show that our algorithmically efficient scheme approximates regularization by the trace norm.

Finally, in the context of K-dimensional multivariate linear regression, we study, from a theoretical point of view, the recovery of the union support (defined as the set of covariates relevant to at least one of the K output predictions) using the proposed ℓ1/ℓ2 regularization scheme. We show that the statistical complexity of the problem, measured by the number of observations needed to recover the correct union support with high probability, depends on a function that we introduce and analyze: the sparsity-overlap function. The theory, in accordance with empirical simulations, characterizes the situations in which selecting variables simultaneously with our scheme is more efficient than selecting them separately.
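
To make the ℓ1/ℓ2 block penalty concrete, here is a minimal NumPy sketch under stated assumptions. It is not the blockwise path-following scheme of the thesis: it swaps in plain proximal gradient descent on a squared loss, and the function names, the (1/2n) loss scaling, the fixed iteration count, and the tolerance are all hypothetical choices made for illustration. Coefficients are stored as a matrix W with one row per covariate and one column per task, so that the penalty sums the ℓ2 norms of the rows and the union support is the set of nonzero rows.

```python
import numpy as np


def l1_l2_norm(W):
    """l1/l2 block norm: sum over covariates (rows) of the row's l2 norm across tasks."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def block_soft_threshold(W, tau):
    """Proximal operator of tau * l1/l2 norm.

    Each row is shrunk toward zero by tau in l2 norm; rows whose norm is
    at most tau become exactly zero, which removes that covariate from
    every task at once.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(row_norms, 1e-12))
    return scale * W


def union_support(W, tol=1e-10):
    """Indices of covariates that are active in at least one task."""
    return np.flatnonzero(np.linalg.norm(W, axis=1) > tol)


def fit_multitask(X, Y, lam, n_iter=500):
    """Proximal gradient on (1/2n) * ||Y - X W||_F^2 + lam * l1_l2_norm(W).

    Illustrative solver only (an assumption of this sketch), not the
    path-following algorithm described in the abstract.
    """
    n, p = X.shape
    W = np.zeros((p, Y.shape[1]))
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y) / n
        W = block_soft_threshold(W - step * grad, step * lam)
    return W
```

Calling `fit_multitask(X, Y, lam)` with an n-by-p design matrix `X` and an n-by-K response matrix `Y`, then `union_support` on the result, returns the covariates selected jointly across the K tasks. Because the threshold acts on whole rows, a covariate is either kept for all tasks or dropped for all tasks, which is the shared sparsity pattern the penalty is meant to encourage.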
Keywords/Search Tags: Simultaneous, Subspace selection, Scheme, Covariate