| Clustering is an important unsupervised machine-learning method that is applied in many fields, such as statistics, pattern recognition, biology, and data mining. Traditional clustering methods struggle to meet the combined demands of efficiency, stability, and robustness. A clustering ensemble combines the results of individual clustering algorithms and can obtain better results than traditional clustering methods. A clustering ensemble can produce more stable and reliable clustering results, which helps in interpreting and analyzing the data distribution. At the same time, it can provide more comprehensive and diverse clustering results, helping users understand the structure and characteristics of the data set in depth. A single clustering algorithm often cannot handle large-scale, complex data sets well, whereas a clustering ensemble can combine the results of multiple clustering algorithms to improve the scalability and adaptability of clustering. Most research on clustering ensembles focuses on designing effective ensemble algorithms and ignores the negative impact of poor-quality base clusters. In addition, most methods treat each cluster in a base clustering as a whole, ignoring the importance of the correlation ranking of samples within the same cluster. In this paper, considering the quality and diversity of base clusters, we explore the sample structure from both local and global perspectives and achieve the following results. (1) A clustering ensemble algorithm based on high-order consistency learning is proposed to represent the relationships among data from different dimensions. Through ablation experiments, the quality and characteristics of each kind of high-order information are analyzed and compared, and the effectiveness of high-order information fusion is verified. To address the complex structure and high fusion cost of high-order information, this paper proposes a new algorithmic framework that integrates the 
processing of multiple kinds of high-order information into a unified framework, finally fusing them into a consistent result. The experimental results show that, compared with the second-best algorithm, LWEA, the proposed algorithm improves accuracy by 7.21% and Normalized Mutual Information (NMI) by 7.34%. It also obtains better clustering results than clustering ensemble algorithms that use only one kind of information. (2) A local sample-weighted clustering ensemble algorithm based on high-order graph diffusion is proposed, which implicitly optimizes the adaptive weights of different neighborhoods according to their ranking importance. Further diffusion of the consensus matrix yields an optimal consensus matrix with stronger discriminative ability, which reveals the latent similarity relationships between samples. Experiments were carried out on benchmark data sets against comparison methods. All empirical results show that our clustering model consistently outperforms related clustering methods. (3) A clustering ensemble system based on high-order consistency learning is designed and implemented. As clustering ensemble technology has developed, more and more algorithms have been proposed, leaving many researchers occupied with reproducing the code of published papers instead of focusing on the algorithms themselves. To help researchers with this problem, the system encapsulates several commonly used and effective clustering ensemble algorithms together with the algorithms proposed in this paper. In summary, to address the existing problems in clustering ensembles, this paper proposes a clustering ensemble method based on high-order consistency learning and a local sample-weighted clustering ensemble algorithm based on high-order graph diffusion, and applies them in the clustering ensemble system. |
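
The general idea of combining base clusterings through a consensus matrix can be illustrated with a minimal sketch. This is a generic co-association baseline, not the high-order consistency or graph-diffusion algorithms proposed in the paper. The function names `co_association` and `consensus_labels`, the threshold of 0.5, and the toy labelings in `base` are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(base_labelings):
    """Fraction of base clusterings that place each pair of samples together."""
    n = len(base_labelings[0])
    m = len(base_labelings)
    counts = [[0] * n for _ in range(n)]
    for labels in base_labelings:
        if len(labels) != n:
            raise ValueError("all base clusterings must label the same samples")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_labels(matrix, threshold=0.5):
    """Link pairs whose co-association exceeds the (assumed) threshold and
    return the connected components as the consensus partition."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical base clusterings of six samples. The second uses
# different label names for the same partition as the first; the third
# misplaces sample 2.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
matrix = co_association(base)
labels = consensus_labels(matrix)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

Because the matrix counts co-membership, not label values, the permuted labels in the second base clustering do not matter, and the single disagreement in the third is outvoted. A baseline like this weights every base clustering and every sample pair equally, which is the limitation that weighting and diffusion schemes aim to address.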