
Research and Implementation of a Clustering Ensemble Algorithm Based on a Voting Strategy

Posted on: 2011-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Chen
Full Text: PDF
GTID: 2178330338478207
Subject: Computer application technology
Abstract/Summary:
Clustering is one of the most active topics in data mining research. Many clustering algorithms are now available, such as k-means, k-medoids, BIRCH, CURE, DBSCAN, and STING. Although some of them are widely used, it is difficult to choose a suitable algorithm for a given data set, because each algorithm places restrictions on the data it can cluster well. Clustering ensembles emerged in response to this need. Experiments have shown that a clustering ensemble can give better results than a single clustering algorithm. The approach is still far from mature, however: open problems include the setting of key parameters, the generation of ensemble members, and the design and selection of consensus functions.

To address the accuracy shortcomings of clustering algorithms, this thesis surveys and analyzes the strengths and weaknesses of existing clustering algorithms and clustering ensemble techniques. On the basis of this analysis, it proposes a clustering ensemble algorithm based on voting: the k-means algorithm generates the ensemble members, and a voting scheme combines them into a consensus clustering. The algorithm was implemented on a VS2005 experimental platform and applied to UCI data sets. The clustering ensemble algorithm and single clustering algorithms were compared in terms of accuracy, and the comparison supports the validity of the method.

In a study of customer segmentation ("customer subsection") for a telecommunications company, clustering ensemble algorithms were applied to fixed-line telephone data for the rural and urban populations of each province in the country. Following a detailed analysis of current telecommunications operations, the thesis uses the voting-based clustering ensemble algorithm to cluster the telephone users of the provinces. The method was compared with other clustering algorithms on the VS2005 experimental platform. The results indicate that it performs better than the other methods for customer segmentation, which suggests that the method is feasible and effective.
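The voting idea described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than on the VS2005 platform used in the thesis, and it is not the thesis's own implementation. The abstract does not specify how cluster labels from different k-means runs are matched before voting, so the brute-force permutation alignment used here is an assumption, as are the function names `kmeans_labels`, `align_labels`, and `vote_ensemble` and the toy data.

```python
import random
from collections import Counter
from itertools import permutations


def kmeans_labels(points, k, seed, iters=50):
    """Plain Lloyd's k-means on tuples of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def align_labels(reference, labels, k):
    """Rename cluster ids in `labels` to agree with `reference` as much as possible.

    Assumption: all k! renamings are tried, which is only practical for small k.
    """
    best = max(
        permutations(range(k)),
        key=lambda perm: sum(perm[l] == r for l, r in zip(labels, reference)),
    )
    return [best[l] for l in labels]


def vote_ensemble(partitions, k):
    """Majority vote per point after aligning every partition to the first one."""
    reference = partitions[0]
    aligned = [reference] + [align_labels(reference, p, k) for p in partitions[1:]]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*aligned)]


# Toy data (hypothetical): two loose groups of 2-D points.
points = [(0.0, 0.2), (0.3, 0.1), (0.1, 0.4), (5.0, 5.1), (5.2, 4.9), (4.8, 5.3)]
# Ensemble members: k-means runs that differ only in their random initialization.
members = [kmeans_labels(points, k=2, seed=s) for s in range(7)]
consensus = vote_ensemble(members, k=2)
print(consensus)
```

The alignment step matters because cluster ids are arbitrary: two runs may find the same grouping but name the clusters differently, and voting on raw labels would then cancel out agreeing members.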
Keywords/Search Tags: clustering, clustering ensemble, voting, consensus function, customer subsection