
Design And Implementation Of Clustering Ensemble Algorithm Based On Partition Selection And Weighting

Posted on: 2022-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: H H He
GTID: 2518306509965149
Subject: Computer technology
Abstract/Summary:
Clustering ensembles can produce high-quality, robust partitions, which overcomes the limitation that a single clustering algorithm is suited only to specific problems. A clustering ensemble consists of two main steps: (1) generating the base partitions; (2) representing them as an information matrix and generating a consensus clustering result. Weighted clustering ensemble and clustering ensemble selection are two methods that further improve ensemble performance from different perspectives. Clustering ensemble selection uses various criteria to select high-quality base partitions. At present, little work designs criteria that measure the quality of a base partition from the relationships between its clusters. Existing weighted clustering ensemble methods usually treat every cluster in a base partition as equally important and assign them all the same weight. This ignores the fact that different clusters contain different sample information, so assigning a different weight to each cluster is more reasonable. To address these two problems, this thesis proposes two new algorithms and compares them experimentally in detail with existing classical algorithms. The main work is as follows:

(1) A weighted clustering ensemble algorithm based on cluster compactness is proposed. The variance of a cluster is computed as the mean of the per-attribute variances over all samples in the cluster. A cluster compactness index is then defined from this variance and used as a weight describing the importance of each cluster. These weights are applied to the traditional co-association (CA) matrix to obtain the final result.

(2) A clustering ensemble selection algorithm based on intra-class scatter and inter-class scatter is proposed. From the intra-class scatter of the clusters in a base partition and the inter-class scatter between different clusters of the same base partition, the intra-class and inter-class scatter of each base partition is defined as a measure of its quality. By requiring the intra-class scatter to be as small as possible and the inter-class scatter to be as large as possible, the combined intra-class and inter-class scatter of a base partition is kept as small as possible, so that high-quality base partitions are selected for the clustering ensemble.

(3) A weighted clustering ensemble analysis system based on MATLAB is designed and implemented. The system supports importing data sets, adjusting the internal parameters of the algorithms, selecting evaluation indices, and displaying experimental results. The visual interface of the system is built with MATLAB GUI technology.

The problems studied in this thesis are key problems to be solved in clustering ensemble, and the work helps advance research in the area. The approach is interpretable in real-world applications and is expected to remain an active research topic.
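
The cluster-weighted co-association idea in contribution (1) can be illustrated with a minimal Python sketch. It is not the thesis implementation (the thesis system is built in MATLAB), and the abstract does not give the exact compactness formula. The weight 1 / (1 + variance), the function names, and the toy data below are all assumptions made for illustration only.

```python
from statistics import pvariance


def cluster_variance(points):
    """Mean of the per-attribute population variances of one cluster."""
    n_attrs = len(points[0])
    return sum(pvariance([p[a] for p in points]) for a in range(n_attrs)) / n_attrs


def compactness_weight(points):
    """Hypothetical compactness index in (0, 1]; tighter clusters score closer to 1.

    The abstract does not state the formula, so 1 / (1 + variance) is an assumption.
    """
    return 1.0 / (1.0 + cluster_variance(points))


def weighted_co_association(data, partitions):
    """Cluster-weighted co-association matrix, averaged over the base partitions.

    Each pair of samples that shares a cluster in a base partition receives that
    cluster's weight instead of the constant 1 used by the plain CA matrix.
    """
    n = len(data)
    ca = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        # Group sample indices by cluster label within this base partition.
        clusters = {}
        for idx, label in enumerate(labels):
            clusters.setdefault(label, []).append(idx)
        for members in clusters.values():
            w = compactness_weight([data[i] for i in members])
            for i in members:
                for j in members:
                    ca[i][j] += w
    m = len(partitions)
    return [[v / m for v in row] for row in ca]


# Toy data: two tight groups, and two base partitions of differing quality.
data = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
partitions = [[0, 0, 1, 1], [0, 1, 1, 1]]
ca = weighted_co_association(data, partitions)
```

In this toy example the second base partition places sample 1 in a loose cluster with samples 2 and 3, so that cluster receives a small weight and contributes little to `ca[1][2]`, while the compact pair (2, 3) keeps a comparatively high co-association. A consensus partition would then be obtained by clustering on the weighted matrix, for instance with hierarchical clustering; that step is omitted here.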
Keywords/Search Tags: Clustering ensemble, Intra-class scatter, Inter-class scatter, Intra-class and inter-class scatter, Cluster compactness index, Variance