Image clustering is the process of dividing an image dataset into several classes according to given criteria, with the goal of high similarity within each class and low similarity between different classes. Image data are typically stored in high-dimensional, non-negative formats. Non-negative matrix factorization (NMF) is an effective clustering method that decomposes a non-negative matrix into two non-negative factor matrices and thereby reduces the dimensionality of the data. However, no single clustering method currently works well on all data, and ensemble clustering is an effective way to address this problem. Ensemble clustering combines the results of multiple base clustering algorithms into a consensus clustering result that is typically better than that of any individual base algorithm. This paper applies ensemble clustering to NMF and proposes three different methods to improve clustering performance. Firstly, clustering experiments require the number of clusters to be specified. Researchers usually use the true number of clusters in experiments, but the true number is difficult to obtain in practical applications. In addition, different base clustering algorithms produce different clustering results, and their influence on the final consensus result also differs. Therefore, an NMF weighted ensemble clustering method with hierarchical pre-processing is proposed to solve these two problems. Secondly, ensemble clustering can be formulated as a mathematical optimization problem, and genetic algorithms are one of the standard models for solving optimization problems. Furthermore, data in practical applications often carry some prior information, such as the cluster labels of some samples or the pairwise relationships between some samples. Therefore, a semi-supervised ensemble clustering method based on a genetic algorithm model is proposed, which uses pairwise constraint information to carry out the crossover and mutation process. Thirdly, a clustering ensemble can be composed of multiple identical or different base clustering algorithms, or of partial subspaces of these base clustering algorithms, in which case it is called a subspace clustering ensemble. The NMF algorithm decomposes the data matrix into a basis (feature) matrix and a coefficient matrix, and the basis matrix can be viewed as the feature subspace of the data matrix. Therefore, a subspace ensemble clustering method based on NMF is proposed, with prior information used as additional constraints; it is called soft subspace ensemble clustering. The three methods are solved iteratively, and the corresponding numerical experiments demonstrate the effectiveness of the improved algorithms.
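To make the basic pipeline concrete, the following is a minimal sketch of NMF-based ensemble clustering; it is not any of the three methods proposed in this paper. It assumes the data matrix `X` is stored as features by samples, uses the standard multiplicative updates for the Frobenius-norm objective, builds an unweighted co-association matrix from several randomly initialized NMF runs, and then, as an illustrative choice only, factorizes that matrix again with NMF to obtain a consensus labeling. All function names and parameters (`nmf`, `co_association`, `consensus_labels`, `n_iter`, `eps`) are hypothetical.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-10):
    """Approximate non-negative X (features x samples) as W @ H.

    W is the basis (feature) matrix and H the coefficient matrix.
    Uses multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def nmf_labels(X, k, seed):
    """Assign each sample (column of X) to its largest coefficient in H."""
    _, H = nmf(X, k, seed=seed)
    return H.argmax(axis=0)

def co_association(labelings):
    """Fraction of base clusterings that place each pair of samples together."""
    labelings = np.asarray(labelings)
    r, n = labelings.shape
    C = np.zeros((n, n))
    for labels in labelings:
        C += labels[:, None] == labels[None, :]
    return C / r

def consensus_labels(C, k, seed=0):
    """Illustrative consensus step: cluster the co-association matrix with NMF."""
    _, H = nmf(C, k, seed=seed)
    return H.argmax(axis=0)

if __name__ == "__main__":
    # Toy non-negative data: 20 features, 30 samples, 3 assumed clusters.
    X = np.random.default_rng(42).random((20, 30))
    base = [nmf_labels(X, 3, seed=s) for s in range(5)]
    C = co_association(base)
    print(consensus_labels(C, 3))
```

In this sketch every base clustering contributes equally to `C`. A weighted ensemble of the kind described above would instead scale each base clustering's contribution, and pairwise constraints could be imposed by fixing the corresponding entries of `C`; both extensions are left out here.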