Kernel-based clustering and low rank approximation

Posted on:2009-07-03

Degree:Ph.D

Type:Thesis

University:Hong Kong University of Science and Technology (Hong Kong)

Candidate:Zhang, Kai

Full Text:PDF

GTID:2448390002499282

Subject:Computer Science

Abstract/Summary:

Clustering is an unsupervised data exploration scenario that is of fundamental importance to pattern recognition and machine learning. This thesis involves two types of clustering paradigms, the mixture models and graph-based clustering methods, with the primary focus on how to improve the scaling behavior of related algorithms for large-scale application. With regard to mixture models, we are interested in reducing the model complexity in terms of number of components. We propose a unified algorithm to simultaneously solve "model simplification" and "component clustering", and apply it with success in a number of learning algorithms using mixture models, such as density based clustering and SVM testing. For graph-based clustering, we propose the density weighted Nystrom method for solving large scale eigenvalue problems, which demonstrates encouraging performance in the normalized-cut and kernel principal component analysis. We further extend this to the low rank approximation of kernel matrices, which is the key component to scaling up the kernel machines. We provide an error analysis on the Nystrom low rank approximation, based on which a new sampling scheme is proposed. Our scheme is very efficient and numerically outperforms a number of state-of-the-art approaches such as incomplete Cholesky decomposition, the standard Nystrom method, and probabilistic sampling approaches.

Keywords/Search Tags:

Clustering, Low rank, Kernel

Related items

1	Research On Subspace Clustering Model And Algorithm
2	Kernel Structure Constrained Low-rank Representation
3	Research On Text Clustering Problems Of Kernel Function And Self-definite Category Number
4	Research On Kernel--based Hierarchical Clustering Algorithm
5	Adaptive Clustering Method Based On The Rank Constraint
6	Research On Robust Subspace Clustering Method Based On Low Rank Representation
7	Researching The Kernel Clustering Algorithm And Its Application In Text Clustering
8	Research On Imputing Algorithm Of Missing Values Based On Kernel Similarity And Low Rank Approximation
9	The Study Of Kernel Methods In Classification, Regression And Clustering With Applications
10	Research On Fuzzy Kernel Clustering Algorithms Based On DC Programming