Mathematical programming for statistical learning with applications in biology and finance

Posted on:2010-07-24

Degree:Ph.D

Type:Dissertation

University:Princeton University

Candidate:Luss, Ronny

GTID:1448390002982845

Subject:Operations Research

Abstract/Summary:

The primary focus of this work is optimization algorithms for statistical learning tools and, in particular, the development and implementation of large-scale algorithms for sparse principal component analysis (PCA) and kernel optimization.

Sparse PCA seeks sparse factors, or linear combinations of the data variables, explaining a maximum amount of variance in the data while having only a limited number of nonzero coefficients. We first enhance a recent first order algorithm for a semidefinite relaxation to sparse PCA using numerically cheaper approximate gradients, allowing us to work with larger data sets. These results are applied to some classic clustering and feature selection problems arising in biology.

We next examine classification problems, specifically using support vector machines (SVM), which are heavily dependent on the choice of an input kernel matrix. Kernel learning seeks to improve classification performance by minimizing an upper bound on test error over a set of kernel matrices. Current classification methods, such as SVM, require positive semidefinite kernel matrices. We first address this limitation using kernel learning to incorporate indefinite kernels into SVM, and describe algorithms to solve the resulting convex eigenvalue optimization problem. We then discuss algorithms for another recent kernel learning problem called multiple kernel learning (MKL), which learns kernels as convex combinations of predefined positive semidefinite matrices.

The final part of this work shows how text from news articles can be used to predict intraday price movements of financial assets using support vector machines. Multiple kernel learning is used to combine equity returns with text as predictive features in order to improve classification performance. Predictability is observed in the occurrence of abnormal returns, but not in their direction.
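
As an illustration of the kernel combination that multiple kernel learning works with, the sketch below forms a convex combination of two predefined positive semidefinite kernel matrices. It is not the dissertation's algorithm: the weights are fixed by hand here, whereas MKL learns them, and the function names (`linear_kernel`, `rbf_kernel`, `combine_kernels`), the toy data, and the `gamma` value are assumptions made only for this example.

```python
import math


def linear_kernel(X):
    """Gram matrix of the linear kernel k(x, y) = <x, y>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def rbf_kernel(X, gamma=0.5):
    """Gram matrix of the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def combine_kernels(kernels, weights):
    """Return sum_i weights[i] * kernels[i] for weights on the simplex.

    A convex combination of positive semidefinite matrices is itself
    positive semidefinite, so the result is a valid SVM kernel matrix.
    """
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel matrix")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): three points in the plane.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
K_lin = linear_kernel(X)
K_rbf = rbf_kernel(X)

# Hand-picked weights; an MKL method would optimize these instead.
K = combine_kernels([K_lin, K_rbf], [0.3, 0.7])
```

The combined matrix `K` could then be passed to any SVM solver that accepts a precomputed kernel matrix. The same pattern would apply when one kernel is built from equity returns and another from text features, as in the final part of the work.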

Keywords/Search Tags:

Kernel learning, Algorithms, Classification

Related items

1	A Study Of Kernel Classification Algorithm Based On Double-Kernel Combinition
2	Research On Multiple Kernel Learning Algorithms And Their Applications
3	High-performed Kernel Classification Methods Based On Multi-kernel Learning
4	Fast Multiple Kernel Learning For Classification And Application
5	Kernels For Feature Extraction And Research On Nonlinear Multiple Kernel Learning
6	Research On Sar Image Classification Based On Kernel Learning
7	Research On Image Classification Based On Nonlinear Kernel Selfadaptive Learning
8	Research On Multiple Kernel Learning Algorithms And Applications
9	Mathematical programming for statistical learning with applications in biology and finance
10	The Research And Application Of Imbalance Classification Algorithm Based On Kernel Strategy And Deep Learning