
Subspace Clustering Research And Application Based On Truncated Schatten-p Norm And Self-representation

Posted on: 2022-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Yang
GTID: 2518306533977449
Subject: Computer technology
Abstract/Summary:
Subspace clustering is a method for clustering high-dimensional data: it maps data from a high-dimensional space into multiple low-dimensional subspaces and clusters the data within those subspaces. It is widely used in data mining and computer vision. Current subspace clustering methods fall into two main categories: traditional machine learning methods and deep learning methods. Low-rank subspace clustering is an important branch of the traditional methods. It uses the low-rank property of a matrix to construct a similarity matrix, from which the clustering result is obtained. However, most low-rank subspace clustering methods ignore the long-tail distribution of the matrix singular values and the contribution of small singular values to the calculation of the matrix rank. Deep subspace clustering methods, in turn, mainly learn a low-dimensional representation of the data with an autoencoder and construct the similarity matrix from that representation. However, most of them consider only the distribution of the low-dimensional representation and do not consider the discrete distribution of the data categories. To address these issues, this thesis proposes the following three improvements to traditional and deep subspace clustering methods:

(1) For traditional low-rank subspace clustering, this thesis proposes a low-rank subspace clustering method based on the truncated Schatten-p norm. The method fits the long-tail distribution of the matrix singular values and also highlights the contribution of small singular values to the calculation of the matrix rank function, which further improves clustering accuracy.

(2) For deep subspace clustering, this thesis proposes a self-representation subspace clustering model based on an adversarial autoencoder. The autoencoder simultaneously learns a low-dimensional representation and a class representation of the data, and the two representations are matched adversarially against a prior normal distribution and a prior multinomial distribution, respectively. In addition, a self-representation layer is introduced so that the similarity matrix is constructed from the low-dimensional representation of the data through its self-representation, which improves the clustering effect.

(3) To address the problem that deep subspace clustering methods require prior knowledge of the distribution of the data categories, this thesis proposes a self-representation subspace clustering model based on mixed latent variables. The low-dimensional representation of the data is mapped to the standard normal distribution by continuous sampling, and the category representation of the data is mapped to the multinomial distribution by Gumbel-softmax sampling. As in the previous model, a self-representation layer is introduced so that the similarity matrix is constructed from the low-dimensional representation of the data through its self-representation. This model reduces human intervention and automatically learns the discrete distribution of the data categories, which further improves the clustering effect.

Experimental results on standard image and video data sets show that the proposed low-rank subspace clustering model based on the truncated Schatten-p norm fits the rank function of the matrix better and constructs a better data similarity matrix. The self-representation subspace clustering model based on the adversarial autoencoder improves the average clustering accuracy by learning the characteristics of the data class distribution. The self-representation subspace clustering model based on mixed latent variables reduces manual intervention in fitting the data class distribution, which further improves the clustering effect. The thesis contains 53 figures, 7 tables, and 97 references.
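
The abstract does not state the formula for the truncated Schatten-p norm. As a minimal sketch, the NumPy code below assumes a definition commonly used in the literature: the sum of the p-th powers of all singular values except the r largest. The function name `truncated_schatten_p` and the parameters `r` (number of leading singular values excluded) and `p` (exponent, typically 0 < p <= 1) are illustrative assumptions, not the thesis's own notation.

```python
import numpy as np


def truncated_schatten_p(X, r, p):
    """Assumed definition: sum of sigma_i**p over all singular values
    of X except the r largest ones."""
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(X, compute_uv=False)
    # Dropping the first r entries leaves only the smaller singular values.
    return float(np.sum(singular_values[r:] ** p))


# Illustrative matrix with known singular values 4, 1, 0.25 and 0.
X = np.diag([4.0, 1.0, 0.25, 0.0])

full = truncated_schatten_p(X, r=0, p=0.5)       # no truncation
truncated = truncated_schatten_p(X, r=1, p=0.5)  # largest value excluded
```

Under this assumed definition, excluding the largest singular values leaves a penalty that depends only on the small ones, which is consistent with the abstract's remark that the method highlights their contribution to the rank function.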
Keywords/Search Tags: subspace clustering, truncated Schatten-p norm, adversarial autoencoder, mixed latent variables