Empirical analysis of sparse principal component analysis

Posted on: 2017-01-14
Degree: M.S.
Type: Thesis
University: Rensselaer Polytechnic Institute
Candidate: Mastylo, Damian Z
Full Text: PDF
GTID: 2468390011498762
Subject: Computer Science
Abstract/Summary:
Sparse Principal Component Analysis (SPCA) builds upon regular Principal Component Analysis (PCA) by including a sparsity factor to further reduce the number of dimensions. The goal of this thesis is to demonstrate the benefits of using an SPCA method that focuses on minimizing information loss as opposed to maximizing variance. Current state-of-the-art SPCA methods include TPower, GPower, and Zou's SPCA as implemented in the SpaSM toolkit. These methods focus on maximizing variance. We hypothesize that the other approach, minimizing information loss, may yield better results in machine learning. We employ a practical approach to examine this problem.

Many optimizations are possible by tuning the sparsity factor, the number of left singular vectors used, or the column subset selection method. Many combinations of these approaches are examined, and their efficacy is reported by comparing information loss, symmetric variance, and the classification accuracy of a Support Vector Machine (SVM) trained on the transformed data set.
Keywords/Search Tags:Principal component, SPCA, Information loss
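
As a rough illustration of two of the quantities the abstract compares, the sketch below computes information loss and explained variance for a single sparse direction. The abstract does not give the thesis's exact definitions, so the formulas here are assumptions: information loss is taken as the squared Frobenius norm of X - X v v^T for a unit-norm v, and explained variance as the squared norm of X v. The `sparsify` helper (keep the k largest-magnitude loadings) is a hypothetical stand-in for illustration only; it is not TPower, GPower, Zou's SPCA, or the thesis's method, and the toy data is made up.

```python
import math

def information_loss(X, v):
    """Assumed definition: squared Frobenius norm of X - X v v^T, v unit-norm."""
    loss = 0.0
    for row in X:
        score = sum(x * w for x, w in zip(row, v))  # coordinate of the row along v
        loss += sum((x - score * w) ** 2 for x, w in zip(row, v))
    return loss

def explained_variance(X, v):
    """Assumed definition: squared norm of X v (unnormalized variance along v)."""
    return sum(sum(x * w for x, w in zip(row, v)) ** 2 for row in X)

def sparsify(v, k):
    """Hypothetical helper: keep the k >= 1 largest-magnitude entries, renormalize."""
    keep = set(sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k])
    kept = [v[i] if i in keep else 0.0 for i in range(len(v))]
    norm = math.sqrt(sum(w * w for w in kept))
    return [w / norm for w in kept]

# Toy column-centered data (made up): 4 samples, 3 features.
X = [[2.0, 1.0, 0.1], [-2.0, -1.0, -0.1], [1.0, 0.5, -0.1], [-1.0, -0.5, 0.1]]

dense = sparsify([2.0, 1.0, 0.1], 3)   # all loadings kept, just normalized
sparse = sparsify([2.0, 1.0, 0.1], 1)  # only one nonzero loading

for name, v in (("dense", dense), ("sparse", sparse)):
    print(name, information_loss(X, v), explained_variance(X, v))
```

Under these assumed definitions, loss and variance for a single unit-norm direction always sum to the squared Frobenius norm of X, so the two objectives coincide in that case. They can diverge once several sparse components are used, because sparse components are generally not orthogonal.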