Research On Selective Clustering Ensemble Based On Cluster Validity Index

Posted on: 2021-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: J J Ma
Full Text: PDF
GTID: 2438330620962958
Subject: Computer application technology
Abstract/Summary:
Clustering ensemble is an important method in machine learning and data mining. It uses a consensus function to integrate multiple, mutually different clustering results, thereby improving the quality of the final clustering. Although clustering ensemble can improve learning accuracy, difficulties remain, for example how to select initial clustering results that have both high individual accuracy and large mutual differences. The essence of selective clustering ensemble is to select, for the ensemble, a subset of the initial clustering results with large differences and high accuracy, thereby improving the quality of the ensemble result.

Research on clustering ensemble concentrates on two problems: first, how to make the initial clustering results, that is, the cluster members, diverse; second, how to choose a suitable fusion method that integrates these cluster members into the final clustering result. Beyond these two problems, selective clustering ensemble must also design a suitable selection strategy that picks appropriate cluster members from the diverse clustering results generated for the ensemble. Because a cluster validity index measures the goodness of a clustering result, this thesis uses that property to filter the set of base cluster members and proposes a selective clustering ensemble algorithm based on cluster validity indexes.

The work focuses on the selection strategy and has two parts. First, a selective clustering ensemble algorithm based on a single cluster validity index is proposed. It uses three classical cluster validity indexes to measure the effectiveness of the base clustering results, uses NMI (Normalized Mutual Information) to select the base clustering results with better effect, and then integrates them with the CSPA algorithm, thereby improving the quality of the clustering results. Second, because a single cluster validity index is only applicable to data sets with a specific distribution, a selective clustering ensemble algorithm based on multiple cluster validity indexes is further proposed. This algorithm combines three cluster validity indexes to evaluate the initial base clustering results and then selects the base clustering results within a certain interval for integration with the CSPA method.

Comparative experiments on five artificial data sets show that the proposed algorithm improves the accuracy of the clustering ensemble results.
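The select-then-integrate pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several parts are assumptions made only for illustration. The abstract does not name the three validity indexes or say exactly how NMI enters the selection, so the sketch scores each base clustering by its mean pairwise NMI with the other members. CSPA (Cluster-based Similarity Partitioning Algorithm) builds a co-association matrix and partitions it as a graph; here the partitioning step is replaced by a simple threshold-and-connected-components stand-in. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def nmi(a, b):
    """Normalized mutual information of two labelings (geometric-mean normalization)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    if ha == 0 or hb == 0:
        # A single-cluster labeling has zero entropy; treat two such labelings as identical.
        return 1.0 if ha == hb else 0.0
    return mi / sqrt(ha * hb)


def mean_pairwise_nmi(members):
    """Assumed score: average NMI of each base clustering with all the others."""
    m = len(members)
    return [
        sum(nmi(members[k], members[l]) for l in range(m) if l != k) / (m - 1)
        for k in range(m)
    ]


def select_members(members, scores, keep):
    """Keep the `keep` base clusterings with the highest scores."""
    ranked = sorted(range(len(members)), key=lambda k: scores[k], reverse=True)
    return [members[k] for k in ranked[:keep]]


def co_association(members):
    """CSPA similarity: fraction of base clusterings that put points i and j together."""
    n, m = len(members[0]), len(members)
    return [
        [sum(lab[i] == lab[j] for lab in members) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_by_threshold(sim, threshold=0.5):
    """Stand-in for CSPA's graph partitioning: connected components of sim > threshold."""
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy base clusterings of six points (hypothetical data, not from the thesis).
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels swapped
    [0, 0, 1, 1, 1, 1],  # one point assigned differently
    [0, 1, 0, 1, 0, 1],  # largely unrelated to the others
]
scores = mean_pairwise_nmi(base)
selected = select_members(base, scores, keep=3)
final = consensus_by_threshold(co_association(selected))
print(scores)
print(final)
```

In this sketch the selection step only needs one score per base clustering, so the mean-NMI score could be swapped for any cluster validity index, or for a combination of several, without touching the integration step.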
Keywords/Search Tags:Clustering ensemble, Selective clustering ensemble, Clustering effectiveness index, Selection Strategy