| With the rapid development of network technology, research on image data is becoming more and more important. Traditional techniques usually transform each image from matrix form into vector form, which greatly increases the data dimension and turns the images into high-dimensional data, and then complete data mining tasks such as clustering or classification by building a data mining model. However, due to the complexity of high-dimensional data, applying traditional feature methods results in large errors, so a data model that represents the complex features more accurately is needed. Subspace clustering methods perform well on clustering nonlinearly separable data generated by a union of (affine) subspaces in high-dimensional Euclidean space. Sparse subspace clustering (SSC) and low-rank representation (LRR) are both subspace representation methods that obtain a sparse or low-rank coefficient representation matrix of the data through subspace approximation of the high-dimensional data. Based on this matrix, an undirected graph of connection weights between the original data points is obtained, and a graph cut method is then used to cluster it. Both are vector-based subspace representation methods. In practice, however, transforming matrices into vectors greatly damages the spatial geometric information of the image data and increases the dimension of the dataset, and the damaged spatial geometric information causes a large number of samples to be misclassified. Within the framework of sparse representation, this paper studies subspace clustering algorithms based on the matrix form. The main work is as follows. First, drawing on the subspace representation method for datasets, this paper constructs a two-dimensional sparse subspace clustering model (2DSSC), which uses a sparse constraint built from the ℓ1 norm to approximate the subspace structure of the feature matrix obtained after reconstructing the 
feature space of the image matrix data. In this model the spatial geometric information of the image matrix data is not destroyed: unlike traditional subspace clustering methods, 2DSSC does not need to convert the image matrix dataset into vectors, so the spatial geometric information between image pixels is preserved and the subsequent linear representation error is reduced. Skipping this transformation also avoids a large increase in data dimension and reduces the computational complexity, so the 2DSSC model obtains more accurate subspace approximations of the dataset than existing subspace clustering models. In addition, after obtaining the feature matrix of the matrix dataset, the 2DSSC model uses the linear relationship between the pixels of the image data to obtain the similarity matrix between pixel rows or columns, which greatly reduces the representation error compared with traditional image clustering methods based on Euclidean metric representation. Extensive comparative experiments confirm that the clustering accuracy of the proposed 2DSSC method is higher than that of existing vector-based subspace clustering methods and of matrix-based Euclidean distance representation methods (two-dimensional feature selection). Building on the 2DSSC method, this paper then studies the completion and clustering of image datasets with missing content and proposes a recovery and clustering method for missing-content images (2DSRMC). This method completes the missing entries of the missing-content image data matrices mainly through the sparse linear representation relationship in the data feature space. At the same time, graph cut is applied to the sparse representation coefficient matrix to obtain the clustering results; that is, through the linear representation relationship between the pixels of the data points, the sparse similarity matrix 
between incomplete data points is obtained during the iterations of the algorithm, and the incomplete dataset is recovered. The model can also effectively recover datasets with a high rate of missing content. Finally, extensive recovery experiments on datasets with different degrees of missing content verify the good recovery and clustering performance of the 2DSRMC method on missing-content image datasets. A parameter analysis of the 2DSRMC model is also given. |
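
For orientation, the vector-based SSC baseline that the abstract contrasts with can be sketched in its standard noiseless form. The notation here is an assumption and does not come from the source: $X \in \mathbb{R}^{d \times n}$ holds the $n$ vectorized images as columns, $C$ is the sparse coefficient matrix, and $W$ is the weight matrix of the undirected graph. This is not the 2DSSC objective, which the abstract does not state.

```latex
\begin{aligned}
&\min_{C \in \mathbb{R}^{n \times n}} \ \|C\|_1
\quad \text{s.t.} \quad X = XC, \quad \operatorname{diag}(C) = 0, \\
&W = |C| + |C|^{\top}.
\end{aligned}
```

A graph cut method such as spectral clustering is then applied to $W$ to obtain the clusters. In its standard form, LRR follows the same pattern but replaces the $\ell_1$ norm with the nuclear norm $\|C\|_*$ and omits the diagonal constraint.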