
Kernel Principal Component Analysis Based On Nyström Sub-Sampling

Posted on: 2022-03-01    Degree: Master    Type: Thesis
Country: China    Candidate: J Q Xiang    Full Text: PDF
GTID: 2480306509484364    Subject: Computational Mathematics
Abstract/Summary:
Principal component analysis (PCA) is a method for reducing the dimensionality of data. PCA applies an orthogonal transformation to capture the main linear characteristics of the data, removing redundancy while preserving as much interpretability as possible. However, PCA is essentially a linear method and cannot capture non-linear characteristics of the data, which are often exactly what matters in practical problems. Kernel principal component analysis (KPCA) introduces the kernel trick into PCA: it implicitly computes inner products under a nonlinear feature map and thereby captures high-order statistical properties of the data in a high-dimensional feature space. KPCA is therefore widely used in many fields, such as image denoising and face recognition.

KPCA requires an eigendecomposition of the kernel matrix, and the time and space complexity of standard eigendecomposition methods are O(n³) and O(n²), respectively. To reduce the computational cost of eigendecomposing a large-scale kernel matrix, a common strategy is to approximate the kernel learning problem by a linear prediction problem; the best-known methods in this category are random Fourier features and the Nyström method. In particular, Sterge et al. recently proved that Nyström sub-sampling can reduce the time and space complexity to O(n(log n)⁴) and O((log n)⁴) without losing the statistical performance of KPCA. However, most existing Nyström-based KPCA methods assume that the kernel function has zero mean, so the resulting algorithms are not suitable for general kernels such as the Gaussian kernel.

This thesis studies Nyström sub-sampling based KPCA (NY-KPCA) for general kernel functions. In particular, we establish a representer theorem for NY-KPCA from the viewpoint of function approximation and derive formulas for its training and test errors. In addition, we study the impact of the K-medoids sampling technique on the accuracy of NY-KPCA. Numerical experiments combining NY-KPCA with SVM on synthetic and real datasets demonstrate the promise of the NY-KPCA algorithm.
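For concreteness, the sketch below illustrates one standard way to realise Nyström-approximated KPCA for a kernel with non-zero mean such as the Gaussian kernel: select landmark points (uniformly, or by K-medoids as studied in the thesis), build an approximate feature map from the landmark kernel blocks, centre it explicitly in feature space, and take principal components of the centred features. This is a minimal sketch of the general technique under these assumptions, not the thesis's NY-KPCA implementation; the names (nystroem_kpca, n_landmarks, gamma) are illustrative only.

    import numpy as np
    from scipy.linalg import eigh
    from sklearn.metrics.pairwise import rbf_kernel

    def nystroem_kpca(X, n_landmarks=100, n_components=2, gamma=0.1,
                      landmark_idx=None, seed=0):
        """Nyström-approximated KPCA with explicit feature-space centring (illustrative)."""
        n = X.shape[0]
        rng = np.random.default_rng(seed)

        # 1. Landmark selection: uniform sub-sampling by default; indices chosen
        #    by a K-medoids clustering can be passed in via landmark_idx to mimic
        #    the K-medoids variant discussed in the abstract.
        if landmark_idx is None:
            landmark_idx = rng.choice(n, size=n_landmarks, replace=False)
        landmarks = X[landmark_idx]

        # 2. Kernel blocks for the Gaussian (RBF) kernel, which has non-zero mean.
        K_nm = rbf_kernel(X, landmarks, gamma=gamma)          # (n, m)
        K_mm = rbf_kernel(landmarks, landmarks, gamma=gamma)  # (m, m)

        # 3. Approximate feature map Phi = K_nm @ K_mm^{-1/2}, so that
        #    Phi @ Phi.T approximates the full kernel matrix; the cost is
        #    O(n m^2 + m^3) instead of the O(n^3) full eigendecomposition.
        w, V = eigh(K_mm)
        w = np.maximum(w, 1e-12)                              # guard tiny/negative eigenvalues
        Phi = K_nm @ (V / np.sqrt(w)) @ V.T                   # (n, m)

        # 4. Centre in feature space: subtract the empirical mean of the mapped
        #    data, which is what centring the kernel matrix does implicitly.
        Phi_c = Phi - Phi.mean(axis=0, keepdims=True)

        # 5. Principal components of the centred approximate features give the
        #    Nyström-KPCA projections of the input points.
        U, s, _ = np.linalg.svd(Phi_c, full_matrices=False)
        return U[:, :n_components] * s[:n_components]

    if __name__ == "__main__":
        X = np.random.default_rng(1).normal(size=(500, 10))
        Z = nystroem_kpca(X, n_landmarks=50, n_components=2, gamma=0.5)
        print(Z.shape)  # (500, 2)

With m landmarks and m much smaller than n, this route never forms or eigendecomposes the full n-by-n kernel matrix, which is the source of the complexity savings that Nyström sub-sampling offers over standard KPCA.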
Keywords/Search Tags: Nyström, KPCA, NY-KPCA, K-medoids, Reconstruction error, Centered kernel