
Non-convex Optimization for Machine Learning: Design, Analysis, and Understanding

Posted on: 2018-06-23    Degree: Ph.D    Type: Thesis
University: Princeton University    Candidate: Ma, Tengyu    Full Text: PDF
GTID: 2448390005458212    Subject: Artificial Intelligence
Abstract/Summary:
Non-convex optimization is ubiquitous in modern machine learning: recent breakthroughs in deep learning require optimizing non-convex training objectives, and even problems that admit accurate convex relaxations can often be solved more efficiently with non-convex formulations. However, the theoretical understanding of non-convex optimization remains rather limited. Can we extend the algorithmic frontier by efficiently optimizing a family of interesting non-convex functions? Can we successfully apply non-convex optimization to machine learning problems with provable guarantees? How do we interpret the complicated models in machine learning that demand non-convex optimizers?

Towards addressing these questions, in this thesis we theoretically study several machine learning models, including sparse coding, topic models, matrix completion, linear dynamical systems, and word embeddings.

We first consider how to find a coarse solution that can serve as a good starting point for local improvement algorithms such as stochastic gradient descent. We propose efficient methods for sparse coding and topic inference with better provable guarantees.

Second, we propose a framework for analyzing local improvement algorithms that start from a coarse solution, and we apply it successfully to the sparse coding problem.

Then, we consider a family of non-convex functions in which every local minimum is also a global minimum (together with an additional regularity property). Such functions can be optimized efficiently by local improvement algorithms from a random or arbitrary starting point. The challenge we address here, in turn, becomes proving that a given objective function belongs to this class. We establish such results for the natural learning objectives of matrix completion and linear dynamical systems.

Finally, we take steps towards interpreting the non-linear models that require non-convex training algorithms. We reflect on the principles of word embeddings in natural language processing. We give a generative model for text, using which we explain why different non-convex formulations such as word2vec and GloVe learn similar word embeddings with the surprising property that analogous word pairs have embeddings with similar differences (for instance, the difference between the vectors for "king" and "man" is close to that between "queen" and "woman").
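To make the "all local minima are global" phenomenon concrete, the following minimal NumPy sketch runs plain gradient descent on the standard factorized matrix-completion objective f(U) = ||P_Omega(U U^T - M)||_F^2 from a random starting point. This is only an illustrative toy under assumed settings (the dimensions, sampling rate, step size, and iteration count are arbitrary choices of ours), not the algorithm or analysis from the thesis.

    import numpy as np

    # Toy illustration (not the thesis's algorithm): non-convex matrix completion
    # by gradient descent on the factorized objective
    #     f(U) = || P_Omega(U U^T - M) ||_F^2,
    # started from a random point. If every local minimum of such an objective is
    # global, plain local improvement like this should recover the low-rank
    # ground truth. All hyperparameters below are arbitrary illustrative choices.

    rng = np.random.default_rng(0)
    n, r = 50, 3
    U_true = rng.normal(size=(n, r))
    M = U_true @ U_true.T                    # ground-truth rank-r matrix
    mask = rng.random((n, n)) < 0.3          # observed entries Omega (~30%)

    U = 0.1 * rng.normal(size=(n, r))        # random initialization
    lr = 2e-3
    for _ in range(3000):
        R = mask * (U @ U.T - M)             # residual on observed entries
        grad = 2 * (R + R.T) @ U             # gradient of f at U
        U -= lr * grad

    err = np.linalg.norm(U @ U.T - M) / np.linalg.norm(M)
    print(f"relative recovery error: {err:.3f}")

If the landscape of this toy instance indeed has no spurious local minima at this sampling rate, the printed relative error should drop close to zero; otherwise gradient descent could stall at a spurious critical point, which is exactly the distinction the landscape results in the thesis are about.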
Keywords/Search Tags: Machine learning, Non-convex, Word embeddings, Local improvement algorithms, Functions