Font Size: a A A

Kernel-based clustering and low rank approximation

Posted on:2009-07-03Degree:Ph.DType:Thesis
University:Hong Kong University of Science and Technology (Hong Kong)Candidate:Zhang, KaiFull Text:PDF
GTID:2448390002499282Subject:Computer Science
Abstract/Summary:
Clustering is an unsupervised data exploration scenario that is of fundamental importance to pattern recognition and machine learning. This thesis involves two types of clustering paradigms, the mixture models and graph-based clustering methods, with the primary focus on how to improve the scaling behavior of related algorithms for large-scale application. With regard to mixture models, we are interested in reducing the model complexity in terms of number of components. We propose a unified algorithm to simultaneously solve "model simplification" and "component clustering", and apply it with success in a number of learning algorithms using mixture models, such as density based clustering and SVM testing. For graph-based clustering, we propose the density weighted Nystrom method for solving large scale eigenvalue problems, which demonstrates encouraging performance in the normalized-cut and kernel principal component analysis. We further extend this to the low rank approximation of kernel matrices, which is the key component to scaling up the kernel machines. We provide an error analysis on the Nystrom low rank approximation, based on which a new sampling scheme is proposed. Our scheme is very efficient and numerically outperforms a number of state-of-the-art approaches such as incomplete Cholesky decomposition, the standard Nystrom method, and probabilistic sampling approaches.
Keywords/Search Tags:Clustering, Low rank, Kernel
Related items