
Research On Subspace Clustering Algorithm Based On Representation

Posted on: 2022-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: X C Jing
GTID: 2518306521496774
Subject: Computer Science and Technology
Abstract/Summary:
As a representative class of unsupervised learning algorithms, clustering has always been an important field of research. With the development of information technology, data has become more complex, and traditional clustering algorithms are no longer suited to high-dimensional and ultra-high-dimensional data. Subspace clustering algorithms based on spectral clustering were developed in response, but they still have many problems. Addressing these problems, the main research work of this thesis is as follows.

To reduce the time cost, dimensionality reduction is an essential preprocessing step for the input image data. Principal Component Analysis (PCA) is a traditional dimensionality reduction algorithm. However, PCA operates on a two-dimensional data matrix, so each image must be flattened into a vector before dimensionality reduction. This destroys the spatial structure of the images, which is unfavorable for the subsequent clustering. Therefore, we introduce the Multilinear Principal Component Analysis (MPCA) algorithm, which uses tensor calculations to reduce the dimensionality of the images without flattening them. In the experiments, the two algorithms are compared on different image data sets. The results show that MPCA is the more suitable dimensionality reduction method for subspace clustering.

As the volume of data increases, traditional subspace clustering takes much longer to produce results. To improve efficiency, we present a unified framework based on information transfer for sparse subspace clustering (SSC). The framework is composed of two stages. In the first stage, a small number of data points are selected by a sampling strategy, and a representation coefficient matrix is computed for them by traditional methods. In the second stage, a second representation coefficient matrix is computed for the remaining data by information transfer, which is more efficient. The two matrices are integrated and passed to spectral clustering to obtain the clustering results. Moreover, the proposed framework is flexible and scalable: it can use different sampling methods and can be extended to other subspace clustering algorithms. Experimental results on the COIL-20, YaleBCrop025 and COIL100 databases confirm that the new framework improves efficiency while maintaining a certain degree of accuracy.

Traditional subspace clustering algorithms construct a single model, which is easily affected by noise and can hardly balance the sparsity and connectivity of the representation coefficient matrix. Therefore, we propose a post-processing strategy for subspace clustering that takes both sparsity and connectivity into account. First, we define close neighbors as neighbors that share more common neighbors with a sample and have higher coefficients; the close neighbors are selected with the non-dominated sorting algorithm. Second, the coefficients between a sample and its close neighbors are retained, while incorrect or useless connections are pruned. In this way, the post-processing strategy retains intra-subspace connections and prunes inter-subspace connections. In the experiments, we verify the generality and effectiveness of the post-processing strategy in both the traditional image recognition field and the IoT field. The experimental results demonstrate that the proposed strategy can handle noisy IoT data and improve clustering accuracy.
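The close-neighbor post-processing described above can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract gives no formulas, so the function names (`first_pareto_front`, `prune_to_close_neighbors`), the threshold `tol`, the definition of a neighbor as any sample with a non-negligible coefficient, and the choice to keep only the first non-dominated front are all assumptions made for illustration.

```python
def first_pareto_front(points):
    """Indices of points not dominated by any other point.

    Both objectives are maximised. A point is dominated if another point
    is at least as good on both objectives and strictly better on one.
    """
    front = []
    for a, (a1, a2) in enumerate(points):
        dominated = any(
            b1 >= a1 and b2 >= a2 and (b1 > a1 or b2 > a2)
            for b, (b1, b2) in enumerate(points)
            if b != a
        )
        if not dominated:
            front.append(a)
    return front


def prune_to_close_neighbors(C, tol=1e-8):
    """Keep, for each sample, only the coefficients to its close neighbors.

    C is a square representation coefficient matrix (list of lists).
    Assumption: j is a neighbor of i when |C[i][j]| > tol.
    Assumption: close neighbors are the first non-dominated front under
    the two objectives (number of common neighbors, |coefficient|).
    """
    n = len(C)
    neighbors = [
        {j for j in range(n) if j != i and abs(C[i][j]) > tol}
        for i in range(n)
    ]
    pruned = [[0.0] * n for _ in range(n)]
    for i in range(n):
        candidates = sorted(neighbors[i])
        scores = [
            (len(neighbors[i] & neighbors[j]), abs(C[i][j]))
            for j in candidates
        ]
        for k in first_pareto_front(scores):
            j = candidates[k]
            pruned[i][j] = C[i][j]  # retain the connection to a close neighbor
    return pruned


# Toy example (hypothetical data): samples 0-2 lie in one subspace,
# samples 3-4 in another, and C[0][3] is a weak spurious connection.
C = [
    [0.0, 0.6, 0.5, 0.1, 0.0],
    [0.7, 0.0, 0.4, 0.0, 0.0],
    [0.5, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0],
]
pruned = prune_to_close_neighbors(C)
for row in pruned:
    print(row)
```

In this toy matrix the weak cross-subspace entry `C[0][3]` is removed because its candidate neighbor has both fewer common neighbors and a smaller coefficient than another candidate. Keeping only the first front is deliberately aggressive and also drops some intra-subspace entries; retaining further fronts would trade sparsity for connectivity.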
Keywords/Search Tags:Sparse subspace clustering, Information transfer, Sampling, Close neighbors, Dimensionality reduction