Font Size: a A A

Study Of Chameleon Clustering Algorithm

Posted on:2018-12-12Degree:MasterType:Thesis
Country:ChinaCandidate:H F ChenFull Text:PDF
GTID:2348330533466150Subject:Mathematics
Abstract/Summary:PDF Full Text Request
Clustering problem is an important research topic in the field of data mining. It can not only be used as an independent tool to discover the feature information of data set, but also can be used as preprocessing of other data mining algorithms. Therefore, it is possible to improve the clustering performance of clustering algorithm Has important research significance. Hierarchical clustering algorithm is one of the commonly used clustering algorithms. Hierarchical clustering algorithm usually has the shortcomings of static model selection expectation difference or model not suitable for certain data characteristics. C.hameleon clustering algorithm is a basic aggregation hierarchical clustering algorithm, based on selt-similarity structure dynamic model. The main advantage of Chameleon clustering algorithm is that the algorithm is simple, fast and efficient to deal with large data sets, and the data characteristics are very low. However, the clustering effect of Chameleon clustering algorithm depends on the partitioning effect on KNN graph.In this paper, we focus on the graph partitioning method of the Chameleon clustering algorithm and optimize the partitioning effect of the graph, thus optimizing the final clustering results. And the improved algorithm is used to cluster the UCl data set and the artificial data set. The experimental results show that the clustering effect is superior to the original algorithl and has higher stability.The main results of this paper are as follows:1. Improved Metis algorithm. Metis algorithm based on stochastic natching principle and recursive two-level partition method to complete the KNN graph of the coarsening and initial division steps, easy to make the similarity of the larger points are separated, this paper uses the maximum weight matching principle and the minimum spanning tree method to replace the original method , To achievc the coarsening of the algorithm and the initial division process, the maximum likelihood of similar points to a sub-cluster, to improve the clustering effect.2. Improved Chameleon algorithm. Chameleon algorithm is the use of Metis algorithm to complete the divisiol of the map, and therefore, clustering results also with the sub-cluster division of the effect of change,at the same time, in the drawing division process, the previous layer of the local optimal division is not necessarily the next layer Of the local optimal division. In this paper, the improved K-means clustering algorithm and DP clustering algorithm are used to graph the partition, which can improve the stability of the clustering results and maintain the high cohesion of the cluster, and the number of iterations is obviously reduced. To a certain extent, also reduces the time complexity.Using the UCI data set and the two-dimensional artificial data set, the improved algorithm of this paper is studied empirically. The results show that the improved algorithm has better effect in clustering accuracy and operation efficiency.
Keywords/Search Tags:Chameleon algorithm, K-mean algorithm, Metis algorithm, DP algorithm, Minimum spanning tree
PDF Full Text Request
Related items