
Sparsity and dimensionality reduction for kernel regression analysis

Posted on: 2004-07-12
Degree: Ph.D
Type: Thesis
University: Rensselaer Polytechnic Institute
Candidate: Momma, Michinari
Full Text: PDF
GTID: 2468390011459004
Subject: Statistics
Abstract/Summary:
Kernel methods such as support vector machines (SVM) and kernel partial least squares (KPLS) have proven very effective for nonlinear inference. A key to success in nonlinear modeling is capacity control. For example, SVM explicitly imposes a convex regularization term in the objective function, while KPLS iteratively optimizes the objective function rather than solving the full problem at once.

Dimensionality reduction and sparsity are two forms of capacity control. By extracting features, dimensionality reduction methods construct a low-rank approximation of the original data matrix; the model capacity is controlled by the feature-extraction criterion and the reduced rank of the approximation. In kernel methods, a dense model must compute all the elements of the kernel matrix. Imposing sparsity reduces the number of support vectors, making the model less complex and computationally more efficient.

The first goal of this thesis is to provide a framework for sparsifying KPLS solutions. The proposed approach, ν-KPLS, modifies KPLS by introducing the ε-insensitive loss, which yields sparsity in the dual solutions. The ν-KPLS algorithm must solve a convex optimization problem many times, so two simple heuristics are proposed to compute the predictive model efficiently.

Further challenges in kernel methods are the selection of a kernel type and kernel parameters, and the interpretability of the resulting models. This thesis also addresses these issues. We focus on a heterogeneous kernel framework that allows combinations of different types of kernels. Ideas from ensemble methods are used to select interpretable kernels and optimize the kernel parameters simultaneously.

It can be shown that PLS solves a least squares problem. However, the squared loss is not always the most appropriate choice; it should depend on the data set. A generic boosting framework, AnyBoost, is used to extend PLS to arbitrary loss functions. The resulting generalization carries the key advantages of PLS over to general machine learning tasks, and also provides a dimensionality reduction technique based on a generic loss function.
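To make the capacity-control role of the extracted components concrete, the sketch below shows a generic kernel PLS regression in Python, following the standard NIPALS-style deflation scheme (in the spirit of Rosipal and Trejo, 2001) rather than the thesis's ν-KPLS variant; the RBF kernel, the omission of kernel centering, and the toy data are illustrative assumptions, not details taken from the thesis.

```python
# Minimal sketch of generic kernel PLS regression (NIPALS-style deflation).
# This is NOT the thesis's nu-KPLS: the RBF kernel, the skipped kernel
# centering, and the toy data below are illustrative assumptions.
import numpy as np


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)


def kpls_fit(K, Y, n_components):
    """Extract `n_components` latent components from the (n x n) kernel
    matrix K and the centered (n x 1) response Y, deflating both after
    each component.  Returns dual coefficients alpha so that predictions
    are k(x, X_train) @ alpha.  Capacity is controlled by n_components."""
    n = K.shape[0]
    T = np.zeros((n, n_components))   # score vectors (orthonormal)
    U = np.zeros((n, n_components))   # response-side weight vectors
    K_res, Y_res = K.copy(), Y.copy()
    for i in range(n_components):
        u = Y_res[:, 0]               # single-response case: one NIPALS pass
        t = K_res @ u
        t /= np.linalg.norm(t)
        T[:, i], U[:, i] = t, u
        # Deflate: remove the extracted component from K and Y.
        P = np.eye(n) - np.outer(t, t)
        K_res = P @ K_res @ P
        Y_res = Y_res - np.outer(t, t @ Y_res)
    # Dual regression coefficients; note they are dense (one per training
    # point), which is exactly what the thesis's sparsification targets.
    alpha = U @ np.linalg.solve(T.T @ K @ U, T.T @ Y)
    return alpha


# Toy usage: more components -> higher model capacity.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = np.sin(X[:, 0:1]) + 0.1 * rng.normal(size=(100, 1))
y -= y.mean()                          # center the response
K = rbf_kernel(X, X)
alpha = kpls_fit(K, y, n_components=5)
rmse = np.sqrt(np.mean((K @ alpha - y) ** 2))
print(f"training RMSE with 5 components: {rmse:.3f}")
```

In this sketch the number of components plays the role of the "reduced rank" mentioned above: fewer components give a lower-rank approximation of the kernel matrix and hence a lower-capacity model, while the dual coefficients remain dense unless a sparsifying loss such as the ε-insensitive loss is introduced.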
Keywords/Search Tags: Dimensionality reduction, Kernel, KPLS, Sparsity, Methods, Framework, Loss