
New Techniques for High-Dimensional and Complex Data Analysis Based on Weighted Learning

Posted on: 2014-01-24 | Degree: Ph.D. | Type: Thesis
University: North Carolina State University | Candidate: Shin, Seung Jun | Full Text: PDF
GTID: 2458390005995261 | Subject: Statistics
Abstract/Summary:
We develop new statistical tools for high-dimensional and complex data, which have become common in many applications. The thesis consists of four topics, and the common thread linking them is weighted learning.

In the first two chapters, we establish the joint piecewise linearity of two popular kernel machines, the weighted support vector machine (WSVM) and kernel quantile regression (KQR). Besides the regularization parameter, each has an additional parameter: a weight parameter for the WSVM and a quantile parameter for KQR. In Chapter two, joint piecewise linearity of the WSVM solution is established, and an associated algorithm that efficiently computes the entire solution surfaces of the WSVM is proposed. In Chapter three, a piecewise linear conditional survival function estimator is proposed based on the two-dimensional solution surfaces of censored kernel quantile regression, which can be viewed as a special case of weighted KQR.

In the remaining two chapters, we study sufficient dimension reduction (SDR) in binary classification. While SDR has been extensively explored in regression with a continuous response, SDR in binary classification, where most existing SDR methods struggle, has not been thoroughly researched. We propose two novel SDR estimation methods for binary classification. In Chapter four, a probability-enhanced SDR scheme is proposed. The key idea is to slice the data based on the conditional class probability rather than the binary response. Such probability-based slicing can be conveniently done by solving a sequence of WSVMs. In Chapter five, we develop a weighted principal support vector machine (WPSVM) for SDR in binary classification by extending the idea of the principal support vector machine (PSVM), recently developed by Li et al. (2011) in the context of regression. The proposed WPSVM successfully achieves SDR with binary responses and can handle both linear and nonlinear SDR in a unified framework.
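
As a rough illustration of the probability-based slicing idea from Chapter four, the Python sketch below fits one linear WSVM per value on a grid of weight parameters and labels each observation by how many of those fits classify it as +1. It is a minimal sketch under stated assumptions, not the thesis's algorithm: the function names `fit_wsvm_linear` and `probability_slices` are hypothetical, the weighting convention (weight 1 - pi on the positive class, pi on the negative class, so that the classifier targets sign(p(x) - pi)) is assumed, and a plain subgradient solver stands in for the solution-surface algorithm of Chapter two.

```python
def fit_wsvm_linear(X, y, pi, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear weighted SVM by full-batch subgradient descent.

    Minimises (1/n) * sum_i c_i * max(0, 1 - y_i * f(x_i)) + (lam/2) * ||w||^2
    with f(x) = w . x + b and c_i = 1 - pi if y_i == +1 else pi
    (assumed weighting convention). Returns a function that maps a
    feature vector to a label in {-1, +1}.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            ci = (1.0 - pi) if yi == 1 else pi
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge subgradient is active for this point
                for j in range(d):
                    grad_w[j] -= ci * yi * xi[j] / n
                grad_b -= ci * yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return lambda x: 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


def probability_slices(X, y, pis, fit=fit_wsvm_linear):
    """Assign each observation a slice index in {0, ..., len(pis)}.

    The WSVM with weight parameter pi targets sign(p(x) - pi), so the
    number of grid values at which x is classified +1 counts how many
    pi lie below p(x). That count is used as the slice label.
    """
    slices = [0] * len(X)
    for pi in sorted(pis):
        classify = fit(X, y, pi)  # one WSVM fit per grid value
        for i, xi in enumerate(X):
            if classify(xi) == 1:
                slices[i] += 1
    return slices


# Toy one-dimensional data (made up for illustration only).
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [-1, -1, 1, -1, 1, -1, 1, 1]
print(probability_slices(X, y, pis=[0.25, 0.5, 0.75]))
```

The `fit` argument is pluggable so that any WSVM solver, including a kernel version, could replace the subgradient routine. The resulting slice labels would then play the role that slices of a continuous response play in slicing-based SDR methods.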
Keywords/Search Tags:SDR, Data, Weighted, Binary, Support vector machine, Proposed