
Research On Co-association Matrix Based Clustering Ensemble Algorithm

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
Full Text: PDF
GTID: 2428330626955422
Subject: Software engineering
Abstract/Summary:
Cluster analysis, one of the important research directions in data mining, has received widespread attention from researchers. In recent years, many effective clustering algorithms have been proposed and perform well, but it is difficult for a single clustering algorithm to adapt to data with complex structure. Clustering ensemble was proposed to solve this problem and has developed rapidly. Its goal is to improve the stability, robustness and accuracy of clustering by integrating multiple base partition results. Among the many clustering ensemble methods, those based on the co-association matrix are an important research direction and one of the hot spots in this field. This thesis therefore takes co-association matrix based clustering ensemble as its research object. The main research results are as follows:

(1) A clustering ensemble algorithm based on a sample-pair weighted co-association matrix is proposed. The algorithm uses k-means to generate multiple base partitions, and then applies k-means again within each cluster of a base partition to split it into subclusters. The importance of a sample pair in the co-association matrix is evaluated by how much the uncertainty of a cluster changes after the subcluster containing the pair is removed. These importance values are used to weight the sample pairs in the co-association matrix on which the ensemble is built. Experimental results show the effectiveness of the proposed algorithm.

(2) A metric learning based clustering ensemble algorithm is proposed. The algorithm uses a co-association matrix to construct a set of must-link constraints and a set of cannot-link constraints between sample pairs, and gives a corresponding metric learning algorithm. It then uses k-means to generate new base partitions under the learned metric, and uses those base partitions to construct a new co-association matrix. By iterating this process, the construction of the co-association matrix and the generation of base partitions guide and optimize each other. Finally, the algorithm outputs a high-quality clustering result. Experimental results show the effectiveness of the proposed algorithm.
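
The two building blocks shared by both algorithms, a co-association matrix and pairwise link constraints derived from it, can be illustrated with a minimal sketch. It is not the thesis's implementation. The function names `co_association` and `link_constraints` are hypothetical, the base partitions are hard-coded toy label vectors standing in for k-means outputs, the matrix is the plain unweighted version (no sample-pair weighting), and the thresholds 0.8 and 0.2 are assumed purely for illustration, since the abstract does not say how the constraint sets are selected. The metric learning step itself is omitted.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of base
    partitions that place samples i and j in the same cluster.

    This is the plain, unweighted co-association matrix.
    """
    n = len(partitions[0])
    if any(len(labels) != n for labels in partitions):
        raise ValueError("all partitions must label the same samples")
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def link_constraints(ca, must_threshold=0.8, cannot_threshold=0.2):
    """Split sample pairs into must-link and cannot-link sets.

    Assumption for illustration: pairs that co-occur very often become
    must-link, pairs that almost never co-occur become cannot-link, and
    ambiguous pairs in between are left unconstrained.
    """
    must, cannot = [], []
    for i, j in combinations(range(len(ca)), 2):
        if ca[i][j] >= must_threshold:
            must.append((i, j))
        elif ca[i][j] <= cannot_threshold:
            cannot.append((i, j))
    return must, cannot


# Toy base partitions of five samples. Label names differ between runs,
# which is fine: only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [2, 2, 2, 0, 0],
]
ca = co_association(partitions)
must, cannot = link_constraints(ca)
print("must-link:", must)
print("cannot-link:", cannot)
```

In this toy input, samples 0 and 1 share a cluster in every partition, as do samples 3 and 4, so those pairs become must-link. Sample 2 moves between the two groups, so every pair involving it falls between the thresholds and stays unconstrained. In the iterative scheme described in (2), constraints of this kind would feed a metric learner whose output is used to regenerate the base partitions.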
Keywords/Search Tags: Clustering, Clustering ensemble, Co-association matrix, Information entropy, Metric learning