
Convex Optimization Algorithms and Recovery Theories for Sparse Models in Machine Learning

Posted on: 2015-09-04
Degree: Ph.D.
Type: Thesis
University: Columbia University
Candidate: Huang, Bo
Full Text: PDF
GTID: 2458390005481416
Subject: Mathematics
Abstract/Summary:
Sparse modeling is a rapidly developing topic that arises frequently in areas such as machine learning, data analysis and signal processing. One important application of sparse modeling is the recovery of a high-dimensional object from a relatively small number of noisy observations, which is the main focus of compressed sensing [14,22], matrix completion (MC) [13,34,68] and robust principal component analysis (RPCA) [12]. However, the power of sparse models is hampered by the unprecedented size of the data sets that have become available in practice. It has therefore become increasingly important to harness convex optimization techniques that exploit any underlying "sparsity" structure in problems of extremely large size.

This thesis focuses on two main aspects of sparse modeling. From the modeling perspective, it extends convex programming formulations for matrix completion and robust principal component analysis to the case of tensors, and derives theoretical guarantees for exact tensor recovery under a framework of strongly convex programming. On the optimization side, an efficient first-order algorithm with the optimal convergence rate is proposed and studied for a wide range of linearly constrained sparse modeling problems.

In Chapter 2, we generalize the matrix completion and matrix RPCA models and rigorously study tractable models for provably recovering low-rank tensors. Unlike their matrix-based predecessors, current convex approaches for recovering low-rank tensors from incomplete (tensor completion) and/or grossly corrupted (tensor robust principal component analysis) observations still suffer from a lack of theoretical guarantees, although they have been used in various recent applications and have exhibited promising empirical performance. Chapter 2 attempts to fill this gap: we propose a class of convex recovery models (including strongly convex programs) that provably guarantee exact recovery under certain conditions.

In all of the sparse models for low-rank tensor recovery discussed in Chapter 2, the most popular convex relaxations in current use minimize the sum of the nuclear norms (SNN) of the unfolding matrices of the tensor. In Chapter 3, we show that this approach can be substantially suboptimal: reliably recovering a K-way tensor of length n and Tucker rank r from Gaussian measurements requires Ω(r n^(K-1)) observations, whereas a certain (intractable) non-convex formulation needs only O(r^K + nrK) observations. We introduce a simple new convex relaxation that partially bridges this gap: it succeeds with O(r^⌊K/2⌋ n^⌈K/2⌉) observations. The lower bound for the SNN model follows from our new result on recovering signals with multiple structures (e.g. sparse and low-rank), which demonstrates the significant sub-optimality of the common approach of minimizing the sum of individual sparsity-inducing norms (e.g. the l1 norm and the nuclear norm). Our new formulation for low-rank tensor recovery shows how the sample complexity can be reduced by designing convex regularizers that exploit several structures jointly.
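To make the two regularizers compared in Chapter 3 concrete, the following is a minimal numpy sketch of the SNN regularizer and of a balanced "square" matricization norm of the kind the new relaxation builds on; the function names and the toy 4-way example are illustrative choices, not code from the thesis.

    import numpy as np

    def unfold(T, mode):
        # Mode-k unfolding: move axis `mode` to the front and flatten the rest.
        return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

    def snn(T):
        # Sum of nuclear norms (SNN) of all mode unfoldings.
        return sum(np.linalg.norm(unfold(T, k), 'nuc') for k in range(T.ndim))

    def square_norm(T):
        # Nuclear norm of a balanced ("square") matricization: the first
        # ceil(K/2) modes index the rows, the remaining modes the columns.
        split = (T.ndim + 1) // 2
        rows = int(np.prod(T.shape[:split]))
        return np.linalg.norm(T.reshape(rows, -1), 'nuc')

    # Tiny example on a random 4-way tensor.
    T = np.random.randn(5, 5, 5, 5)
    print(snn(T), square_norm(T))

For equal side lengths n, each SNN unfolding is an n x n^(K-1) matrix, while the square matricization is roughly n^⌈K/2⌉ x n^⌊K/2⌋; intuitively, this more balanced shape is what lets the new relaxation get by with fewer Gaussian measurements.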
In Chapter 4, we propose and analyze an accelerated linearized Bregman (ALB) method for solving the sparse models discussed in Chapters 2 and 3. The accelerated algorithm is based on the fact that the linearized Bregman (LB) algorithm, first proposed by Stanley Osher and his collaborators, is equivalent to a gradient descent method applied to a certain dual formulation. We show that the LB method requires O(1/ε) iterations to obtain an ε-optimal solution, and that the ALB algorithm reduces this iteration complexity to O(1/√ε) while requiring almost the same computational effort per iteration. Numerical results on compressed sensing and matrix completion problems are presented that demonstrate that the ALB method can be significantly faster than the LB method.

In Chapter 5, we extend the arguments of Chapter 4 and apply the linearized Bregman method to linear programming problems. Numerical experiments show that neither the LB method nor the accelerated LB method works well on linear programming problems, especially on real data sets. Inspired by the observation that the linearized Bregman method, when solving linear programs, can be viewed as a sequence of box-constrained quadratic minimizations restricted to different supports of the solution variable, we propose new algorithms that combine Conjugate Gradient/BFGS updates with the linearized Bregman method. Numerical experiments were conducted on both synthetic and real data sets; the new CGLB algorithm not only significantly outperformed the accelerated linearized Bregman method proposed in Chapter 4, but was also comparable with the MATLAB solver on small- to medium-scale real data sets.
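To illustrate the dual-gradient view of the linearized Bregman method described in Chapter 4, the following is a minimal numpy sketch of the LB iteration and of a Nesterov-accelerated variant, applied to the basis pursuit problem min ||x||_1 s.t. Ax = b. The smoothing parameter mu, the step size, and the function names are illustrative assumptions, not the thesis's implementation.

    import numpy as np

    def shrink(v, mu):
        # Soft-thresholding: sign(v) * max(|v| - mu, 0), componentwise.
        return np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)

    def lb(A, b, mu=5.0, iters=2000):
        # Plain linearized Bregman, written as gradient ascent on the dual of
        #   min  mu*||x||_1 + 0.5*||x||_2^2   s.t.  Ax = b.
        tau = 1.0 / np.linalg.norm(A, 2) ** 2      # step size from ||A||_2^2
        y = np.zeros(A.shape[0])
        for _ in range(iters):
            x = shrink(A.T @ y, mu)                # primal iterate from dual y
            y = y + tau * (b - A @ x)              # dual gradient step
        return x

    def alb(A, b, mu=5.0, iters=2000):
        # Accelerated variant: Nesterov-style extrapolation of the dual iterate.
        tau = 1.0 / np.linalg.norm(A, 2) ** 2
        y = y_hat = np.zeros(A.shape[0])
        t = 1.0
        for _ in range(iters):
            x = shrink(A.T @ y_hat, mu)
            y_new = y_hat + tau * (b - A @ x)
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            y_hat = y_new + (t - 1.0) / t_new * (y_new - y)
            y, t = y_new, t_new
        return x

    # Small compressed-sensing example: recover a sparse x0 from b = A @ x0.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((80, 200))
    x0 = np.zeros(200)
    x0[rng.choice(200, 10, replace=False)] = rng.standard_normal(10)
    b = A @ x0
    print(np.linalg.norm(alb(A, b) - x0))

The only difference between the two loops is the extrapolation of the dual iterate, which is what lowers the iteration complexity from O(1/ε) to O(1/√ε) while keeping the per-iteration cost essentially unchanged.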
Keywords/Search Tags: Sparse, Convex, Linearized Bregman, Real data sets, Chapter, LB method, Recovery, Algorithm