
Topics on supervised and unsupervised dimension reduction

Posted on: 2011-12-10
Degree: Ph.D
Type: Dissertation
University: The Pennsylvania State University
Candidate: Artemiou, Andreas A
Full Text: PDF
GTID: 1448390002957208
Subject: Statistics
Abstract/Summary:
In the first part of this work, we extend the results of Artemiou and Li (2009) and Ni (2010) in several interesting ways. First, we extend them to the case of a nonrandom covariance matrix and to the case of a multivariate response in the linear regression setting. Second, we explore whether linear principal components retain predictive potential for nonlinear regression functions, especially in the context of sufficient dimension reduction. Third, we propose an information criterion that, in a limited number of cases, can be used to check the predictive potential of linear principal components. Lastly, we explore the predictive potential of kernel principal components in the fully nonparametric regression model Y = f(X) + epsilon, where f is an arbitrary function. The most general form of our result shows that the phenomenon goes far beyond the context of linear regression and classical principal components, where it was originally noticed: if nature selects an arbitrary distribution for the predictor X and an arbitrary conditional distribution of the response Y given X, then Y tends to have stronger correlation with higher-ranking kernel principal components than with lower-ranking ones. The arbitrariness of the function f and of the matrix Sigma is formalized through unitary invariance. A small data analysis shows that this tendency holds in three different data sets.

In the second part, we propose SVMIR, a new method for sufficient dimension reduction that combines inverse regression with support vector machine algorithms. This method has several advantages over previous inverse regression methods such as SIR, SAVE, and DR. First, since machine learning methods rather than sample moments are used to estimate the directions in the Central Dimension Reduction Subspace, the method is shown to be robust in the presence of outliers.

Second, through a modification of the objective function being minimized, dimension reduction can be achieved without matrix inversion. Third, simulations show that the method is robust to departures from ellipticity as well as to the presence of categorical predictors. Finally, the method provides a way to estimate nonlinear directions in the Central Dimension Reduction Subspace, that is, directions in the feature space induced by kernel functions. These properties are established in theory, through simulations, and by application to two real data examples: one builds a regression model for the relative performance of computer CPUs, and the other classifies E. coli proteins by cellular localization site.

Key Words and Phrases: Kernel principal components; Regression; Unitary invariance; Sufficient Dimension Reduction; Support Vector Machines.
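The tendency described in the first part can be checked empirically. The following minimal sketch (not from the dissertation) generates data from Y = f(X) + epsilon with an arbitrary nonlinear f, extracts kernel principal components with scikit-learn's KernelPCA, and prints the correlation of Y with each component in rank order; the kernel choice, its gamma parameter, and the particular f are illustrative assumptions.

import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.normal(size=(n, p))

# An arbitrary nonlinear regression function f, plus noise (an
# assumption made for illustration only).
f = lambda x: np.sin(x[:, 0]) + x[:, 1] ** 2
Y = f(X) + 0.1 * rng.normal(size=n)

# Leading kernel principal components (Gaussian kernel); columns are
# ordered by decreasing eigenvalue, i.e., by rank.
kpca = KernelPCA(n_components=8, kernel="rbf", gamma=0.1)
Z = kpca.fit_transform(X)

# Correlation of Y with each kernel PC, from highest-ranking down.
# Under the stated tendency, the magnitudes should tend to decrease.
for j in range(Z.shape[1]):
    r = np.corrcoef(Z[:, j], Y)[0, 1]
    print(f"kernel PC {j + 1}: |corr| = {abs(r):.3f}")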
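To convey the flavor of SVM-based inverse regression in the second part, here is a hedged sketch in the style of such estimators: slice the response, fit a linear SVM separating each slice boundary, and recover directions in the Central Dimension Reduction Subspace from the leading eigenvectors of the accumulated hyperplane normals. The slicing scheme, the cost constant C, and the helper name svm_inverse_regression are illustrative assumptions, not the dissertation's exact SVMIR algorithm.

import numpy as np
from sklearn.svm import LinearSVC

def svm_inverse_regression(X, Y, n_slices=5, d=1, C=1.0):
    Xc = X - X.mean(axis=0)                # center the predictors
    # Interior quantiles of Y define the slice boundaries.
    cuts = np.quantile(Y, np.linspace(0, 1, n_slices + 1)[1:-1])
    M = np.zeros((X.shape[1], X.shape[1]))
    for c in cuts:
        labels = (Y > c).astype(int)       # below vs. above the cut
        svm = LinearSVC(C=C, max_iter=10000)
        svm.fit(Xc, labels)
        w = svm.coef_.ravel()              # normal of the separating hyperplane
        M += np.outer(w, w)                # accumulate candidate directions
    # Leading eigenvectors of M span the estimated central subspace.
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, ::-1][:, :d]

# Toy check: recover the single direction beta in Y = (X @ beta)^3 + noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
beta = np.array([1.0, 1.0, 0, 0, 0, 0]) / np.sqrt(2)
Y = (X @ beta) ** 3 + 0.2 * rng.normal(size=400)
print(svm_inverse_regression(X, Y, d=1).ravel())

Because the directions come from SVM fits rather than sample moments, outlying observations influence the estimate only through the hinge loss, which is the intuition behind the robustness claim above.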
Keywords/Search Tags: Dimension reduction, Principal components, Regression