
Research On Sparse Subspace Clustering Models And Algorithms Based On Low-rank Representation

Posted on: 2020-01-01    Degree: Doctor    Type: Dissertation
Country: China    Candidate: R L Liang    Full Text: PDF
GTID: 1368330605970655    Subject: Operational Research and Cybernetics
Abstract/Summary:
Clustering analysis is a fundamental issue in machine learning and data mining, with numerous applications including pattern recognition, image segmentation, and feature selection. In this age of big data, one frequently encounters large collections of high-dimensional data, and how to cluster high-dimensional data is a profound issue in data analysis. Big data therefore calls for subspace clustering, which can cluster high-dimensional data effectively. Subspace clustering has emerged as an exciting technique in clustering analysis; as a fundamental topic in data clustering, subspace clustering techniques are used in fields as diverse as machine learning, computer vision, image processing, and system identification.

Compressed sensing is a novel research area in signal processing, introduced by D. Donoho, E. Candès, and T. Tao in 2006. It was named one of the top 10 scientific and technological advances of 2007 in the United States, and it has greatly promoted the development of sparse theory in both mathematics and engineering applications. Furthermore, low-rank matrix completion is a hot topic in machine learning, computer vision, signal processing, and optimization. Motivated by compressed sensing and matrix completion, subspace clustering based on sparse and low-rank representation has attracted much attention.

In this thesis, we focus on subspace clustering problems arising in different practical circumstances. We propose several novel models and algorithms for sparse subspace clustering based on low-rank representation, and we successfully apply them to image denoising, face clustering, unsupervised learning, and semi-supervised learning. Our major contributions are outlined as follows.

1. We propose a novel unsupervised clustering model for a practical situation in which the observed data may be corrupted by both impulsive noise (sparse but large) and Gaussian noise (small but dense), and in which only a fraction of the entries can be observed. We first propose a new model based on sparse and low-rank representation, which segments the data points from incomplete and noisy observations. We then establish the first-order optimality conditions for this model. Moreover, we construct an alternative formulation and theoretically establish the equivalence between this formulation and the original model.

2. We develop a splitting method for solving the unsupervised clustering model and establish its global convergence. Motivated by augmented Lagrangian algorithms, we first split the model and analyze each subproblem individually. For an easy subproblem, we prove that its optimal solution can be written in closed form; a difficult subproblem is expensive to solve directly, so we do not seek its exact solution but instead obtain an approximate solution quickly. We then prove several important properties of the algorithm, which lead to global convergence. Furthermore, the splitting method is applied to clustering problems with synthetic and benchmark data, respectively. Numerical results demonstrate that the proposed method is computationally efficient and robust, and more accurate than state-of-the-art algorithms.
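For illustration, a representative model of this type can be written as follows; this is only a generic sketch with assumed weights λ1 and λ2, not necessarily the exact formulation studied in this thesis:

\[
\min_{Z,\,S,\,E}\ \ \|Z\|_{*} + \lambda_{1}\,\|S\|_{1} + \frac{\lambda_{2}}{2}\,\|E\|_{F}^{2}
\quad \text{s.t.}\quad \mathcal{P}_{\Omega}(X) = \mathcal{P}_{\Omega}(XZ + S + E),
\]

where X is the data matrix, Z is the low-rank representation, S models the impulsive (sparse) noise, E models the Gaussian (dense) noise, and \(\mathcal{P}_{\Omega}\) keeps only the entries in the observed index set Ω. Splitting methods such as the one in contribution 2 are attractive for models of this form because the nuclear-norm and ℓ1 subproblems admit closed-form solutions through singular value thresholding and soft thresholding, respectively.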
3. We propose two new regularized models for semi-supervised clustering. In detail, we first formulate two regularization terms that make full use of the limited label information and the local geometry information. We then propose a regularized model for the semi-supervised subspace clustering problem; by incorporating the supervision information, this model achieves good clustering performance. In addition, we establish the first-order optimality conditions for this model. Furthermore, we extend the regularized model to a non-negative model that includes a non-negativity constraint.

4. We derive an efficient algorithm for solving the semi-supervised clustering models and prove its global convergence. Finally, the proposed algorithm is applied to semi-supervised clustering problems on synthetic data and several real data sets. Experimental results show that our method achieves high classification accuracy and greatly outperforms several state-of-the-art methods.
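As an illustration of how such representation-based models are turned into cluster assignments in practice, the following minimal Python sketch builds a symmetric affinity matrix from a learned representation matrix Z and applies spectral clustering. It is a generic example using scikit-learn, not the implementation of this thesis; the function name and parameters are assumptions.

import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_from_representation(Z, n_clusters):
    """Turn a learned n-by-n representation matrix Z into cluster labels."""
    # Symmetric nonnegative affinity, as is standard in sparse/low-rank
    # subspace clustering pipelines: W = (|Z| + |Z|^T) / 2.
    W = 0.5 * (np.abs(Z) + np.abs(Z).T)
    sc = SpectralClustering(n_clusters=n_clusters,
                            affinity="precomputed",
                            assign_labels="kmeans",
                            random_state=0)
    return sc.fit_predict(W)

# Usage (hypothetical): Z = some_solver(X) would be returned by one of the
# models above; labels = cluster_from_representation(Z, n_clusters=5)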
Keywords/Search Tags:subspace clustering, sparse optimization, splitting method, face clustering, semi-supervised subspace clustering