
Manifold Learning Based Classification And Clustering Approaches With Their Applications

Posted on: 2012-09-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 1118330341451716
Subject: Mathematics
Abstract/Summary:
With the rapid development of information technology and computer networks, the past few decades have witnessed explosive growth in the availability of data from multiple sources and modalities. This has driven extraordinary advances in how to efficiently and effectively process massive amounts of complex high-dimensional data and extract valuable information, a common focus of researchers in academic and applied mathematics, pattern recognition, and computer vision. Manifold learning is an effective tool for data processing that can extract meaningful information and low-dimensional representations from the original high-dimensional data; it thus offers a solution to "effectively utilize data and information whenever you like and wherever you are". In this thesis, against the background of pattern classification and cluster analysis, several extensions to conventional manifold learning in theory, methods, and applications are proposed, which help us design novel dimensionality reduction algorithms, discover the hidden intrinsic structure of the data, and address the hybrid manifold clustering problem. More concretely, the main contributions include:

1. This thesis extends the theory of traditional manifold learning to the case where the data are high-dimensional and the sample size is small. Moreover, we discuss how to learn under the matrix representation of data.
NPE (Neighborhood Preserving Embedding) often suffers from the singularity problem of the eigen-matrix in generalized eigen-analysis when the number of data samples is smaller than the dimension of the data space. Moreover, the eigen-analysis of NPE can be unstable and its solutions can be incomplete. CNPE (Complete Neighborhood Preserving Embedding), which is based on matrix analysis techniques, is proposed to overcome these drawbacks and extend the corresponding theory. CNPE does not significantly increase the computational complexity and does not lose any useful discriminative information. However, CNPE works with vectorized representations of data, so the original 2D face image matrices must first be transformed into vectors of the same dimension. Such a matrix-to-vector transform usually leads to a high-dimensional image vector space, and much spatial structural information may be lost after vectorization. Thus, we further investigate how to extend CNPE to learn under the matrix representation of data.

2. Under the assumption that the data lie on or close to multiple manifolds, we propose the intrinsic structure model and a new supervised linear dimensionality reduction algorithm called Intrinsic Discriminant Analysis (IDA).
For the task of multi-manifold data modeling in pattern classification, we construct a novel mathematical model, called the intrinsic structure model. Under this model, each data point is divided into three components to characterize certain intrinsic properties and structural relationships conveyed by this point. Then, we propose a new supervised linear dimensionality reduction algorithm called Intrinsic Discriminant Analysis (IDA). When the intrinsic structure model is used in face recognition, we obtain the intrinsic face model, which is the first face model with a mathematical representation rather than a conceptual one. Moreover, we recast the classical PCA and LDA algorithms from the perspective of component analysis, and thus extend the understanding of classical dimensionality reduction algorithms.

3. We carefully analyze why K-flats deteriorates seriously when faced with affine subspaces, as revealed by previous experiments; we then propose the Localized K-flats algorithm (LKF) to remove confusion among different clusters and build a bridge between linear and nonlinear manifold clustering algorithms.
Our analysis reveals that K-flats suffers from three kinds of deterioration, i.e., intrinsic errors, infinity errors and co-linear errors, which are mainly rooted in the reconstruction error measure and the infinitely extending representations of linear manifolds. Then, we propose the Localized K-flats algorithm (abbreviated as LKF), which introduces localized representations of linear manifolds and a new distortion measure into the objective function of K-flats to remove confusion among different clusters. Moreover, LKF naturally has the potential to group manifolds with nonlinear structure, thanks to its localized representations of manifolds. Therefore, the proposed method not only builds a bridge between model-based and similarity-based linear manifold clustering algorithms, but also builds a bridge between linear and nonlinear manifold clustering algorithms.

4. We are the first to explicitly propose the general framework of manifold clustering, i.e., the hybrid manifold clustering problem. Moreover, we analyze the difficulties of this problem and give some feasible ideas for solving it.
As far as we know, we are the first to explicitly propose the hybrid manifold clustering problem, which advances the study of manifold clustering into a more general framework where the manifolds on which the data points lie are (a) linear and/or nonlinear and (b) intersecting and/or not intersecting. This framework covers all aspects of existing studies in manifold clustering. Our further analysis of this problem reveals that its difficulties lie in how to represent the data and their relationships and how to handle the intersection regions reliably, i.e., effectively separating the different sides of the manifolds near the intersection into different structures. Moreover, we give some feasible ideas for solving this problem: local properties and some natural geometric information hidden in the multi-manifold data can be incorporated to deal with the hybrid manifold clustering problem.

5. We thoroughly study the potential of spectral methods for hybrid manifold clustering and propose three effective algorithms from different viewpoints to deal with the hybrid manifold clustering problem.
Inspired by the excellent properties and practical successes of spectral clustering algorithms, we study the potential of Unsymmetrical Normalized Spectral Clustering (UNSC) and Symmetrical Normalized Spectral Clustering (SNSC) for hybrid manifold clustering. Then, we propose three spectral-based algorithms for detecting multiple hybrid manifolds hidden in the data, from three different viewpoints, i.e., neighborhood graph construction, similarity matrix construction and nearest neighbors selection. Thus, we not only propose the hybrid manifold clustering problem, but also effectively address it.

In summary, this thesis refines and advances the theoretical study of manifold learning, and at the same time uses the results of theoretical analysis to address several practical problems in pattern classification and cluster analysis. In particular, we make several meaningful and valuable studies in multi-manifold data modeling and its applications, which is a challenging mathematical problem and an ongoing hot topic in machine learning and artificial intelligence.
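
As an illustration of the K-flats discussion in contribution 3, the following minimal sketch shows how a reconstruction error measure combined with an infinitely extending flat can confuse clusters. It is not the thesis's implementation: the function names `flat_residual` and `assign`, the two example flats, and the sample points are all hypothetical, and reading this behaviour as the "infinity error" named in the abstract is an assumption.

```python
import math


def flat_residual(x, origin, basis):
    """Distance from point x to the flat {origin + sum_j a_j * basis[j]}.

    Assumes `basis` is a list of orthonormal direction vectors (hypothetical
    representation chosen for this sketch).
    """
    diff = [xi - ci for xi, ci in zip(x, origin)]
    resid = diff[:]
    for b in basis:
        # Remove the component of diff that lies along direction b.
        coef = sum(di * bi for di, bi in zip(diff, b))
        resid = [ri - coef * bi for ri, bi in zip(resid, b)]
    return math.sqrt(sum(r * r for r in resid))


def assign(x, flats):
    """Index of the flat with the smallest reconstruction error for x."""
    errors = [flat_residual(x, origin, basis) for origin, basis in flats]
    return errors.index(min(errors))


# Hypothetical setup: cluster A's samples lie on the x-axis for 0 <= x <= 1,
# cluster B's samples lie near the vertical line x = 5.
flats = [
    ((0.0, 0.0), [(1.0, 0.0)]),  # flat A: the whole (unbounded) x-axis
    ((5.0, 0.0), [(0.0, 1.0)]),  # flat B: the whole vertical line x = 5
]

# A slightly noisy sample of cluster B, about 4 units away from A's data.
noisy_b_point = (5.05, 0.01)
print(assign(noisy_b_point, flats))  # index of the flat chosen by residual alone
```

Because flat A extends without bound, the noisy point from cluster B is closer to it (residual about 0.01) than to flat B (residual about 0.05), so a purely residual-based assignment picks cluster A even though none of A's samples are nearby. A localized representation, as the abstract describes for LKF, is one way to avoid this kind of confusion.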
Keywords/Search Tags: manifold learning, pattern classification, cluster analysis, multi-manifold modeling, hybrid manifold clustering