
High Dimensional Learning with Structure Inducing Constraints and Regularizer

Posted on: 2018-07-07
Degree: Ph.D
Type: Thesis
University: University of Minnesota
Candidate: Asiaeetaheri, Amir Asiaee
Full Text: PDF
GTID: 2478390020457664
Subject: Computer Science
Abstract/Summary:
Explosive growth in data generation through science and technology calls for new computational and analytical tools. For the statistical machine learning community, one major challenge is data sets whose dimension exceeds the number of samples. This low-sample, high-dimension regime violates the core assumption of most traditional learning methods. To address this challenge, many high-dimensional learning algorithms have been developed over the past decade.

One of the significant high-dimensional problems in machine learning is linear regression where the number of features is greater than the number of samples. Initially, the high-dimensional linear regression literature focused primarily on estimating sparse coefficients through l1-norm regularization. In a more general framework, one can assume that the underlying parameter has an intrinsic "low dimensional complexity" or structure. Recently, researchers have looked at structures beyond sparsity that are induced by any norm used as the regularizer or constraint.

In this thesis, we focus on two variants of the high-dimensional linear model, namely data sharing and errors-in-variables, where the structure of the parameter is captured with a suitable norm. We introduce estimators for these models and study their theoretical properties. We characterize the sample complexity of our estimators and establish non-asymptotic, high-probability error bounds for them. Finally, we use dictionary learning and sparse coding to perform Twitter sentiment analysis as an application of high-dimensional learning.

Some discrete machine learning problems can also be posed as constrained set function optimization, where the constraints induce a structure over the solution set. In the second part of the thesis, we investigate a prominent set function optimization problem, social influence maximization, under the novel "heat conduction" influence propagation model. We formulate the problem as submodular maximization with a cardinality constraint and provide an efficient algorithm for it. Through extensive experiments on several large real and synthetic networks, we show that our algorithm outperforms well-studied methods from the influence maximization literature.
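
To make "submodular maximization with a cardinality constraint" concrete, here is a minimal sketch of the classical greedy heuristic, which attains a (1 - 1/e) approximation for monotone submodular objectives. It is not the algorithm from the thesis and does not implement the heat conduction model. The objective is a simple set-coverage function used as a stand-in, and the names `reach`, `coverage`, and `greedy_max_coverage` are hypothetical, chosen only for illustration.

```python
from typing import Dict, Hashable, List, Set

# Hypothetical toy data: each candidate seed node maps to the set of
# users it can reach. Coverage of a seed set is monotone and submodular.
reach: Dict[Hashable, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}


def coverage(sets: Dict[Hashable, Set[int]], chosen: List[Hashable]) -> int:
    """Return the number of distinct elements covered by the chosen keys."""
    covered: Set[int] = set()
    for key in chosen:
        covered |= sets[key]
    return len(covered)


def greedy_max_coverage(sets: Dict[Hashable, Set[int]], k: int) -> List[Hashable]:
    """Pick at most k keys, each time adding the one with the largest marginal gain."""
    chosen: List[Hashable] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(sets))):
        best_key, best_gain = None, 0
        for key, elems in sets.items():
            if key in chosen:
                continue
            gain = len(elems - covered)  # marginal gain of adding this key
            if gain > best_gain:
                best_key, best_gain = key, gain
        if best_key is None:  # no remaining key adds anything new
            break
        chosen.append(best_key)
        covered |= sets[best_key]
    return chosen


seeds = greedy_max_coverage(reach, 2)
total = coverage(reach, seeds)
```

With a budget of two seeds, the sketch first picks the key with the largest reach and then the key that adds the most uncovered users. A real influence objective would replace `coverage` with an estimate of expected spread under the chosen propagation model.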
Keywords/Search Tags:Structure, Dimensional, Constraints