
Research Of Fuzzy C-Means Algorithm Based On Ant Colony Clustering With Information Entropy

Posted on: 2011-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Rong
Full Text: PDF
GTID: 2178330338479134
Subject: Computer application technology
Abstract/Summary:
Clustering is one of the important branches of data mining. In recent years, as data mining research has deepened, a substantial number of new clustering algorithms have appeared; each targets different applications and has its own advantages and disadvantages. Among them, the fuzzy C-means clustering algorithm, which is based on an objective function, is the most widely used. However, the algorithm still has shortcomings: its clustering results are easily influenced by the initialization, and the iteration easily falls into local minima. In this paper, the fuzzy C-means algorithm is improved from three perspectives.

(1) The fuzzy C-means algorithm is combined with an ant colony algorithm based on information entropy. First, the entropy-based ant colony algorithm is run to obtain the number of clusters C and the cluster centers. The fuzzy C-means algorithm then uses these as initial parameters for its iterative optimization, which both overcomes the sensitivity of fuzzy C-means to initialization and helps it avoid falling into local minima. The innovation is that the algorithm uses an ant colony algorithm based on information entropy rather than the standard ant colony algorithm. This changes the rule by which ants pick up or put down objects: the decision is made by comparing the change in information entropy rather than by a probability, which reduces the randomness of the algorithm.

(2) The idea of attribute weights is put forward. Practical applications show that, for data sets with multidimensional attributes, some attributes often have an obvious effect on the clustering while others have little effect. Because the distribution of the attributes plays an essential role in data clustering, each attribute is given a weight in this paper. Based on this idea, the formulas for the objective function, the cluster centers, and the membership functions in the fuzzy C-means iteration are derived.

(3) Voronoi distance is used in place of Euclidean distance to adjust the formula for calculating the degree of membership, in order to overcome the susceptibility of fuzzy C-means clustering to interference from noise data. Euclidean distance yields globular clusters, whereas Voronoi distance can form honeycomb-shaped clusters, which gives more realistic results. Using Voronoi distance in the membership formula adjusts the membership values: objects with a large membership in a cluster center have that membership increased, while objects with a small membership (noise data) have their membership in the cluster center decreased. This reduces the interference of isolated points with the clustering results.

Experimental results show that the improved algorithm performs well in terms of running time, clustering accuracy, and stability.
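The abstract does not reproduce the derived formulas, so the following is only a minimal sketch of the general idea in points (1) and (2), under stated assumptions: it uses the textbook fuzzy C-means updates with a weighted squared Euclidean distance, the attribute weights `w` are fixed and supplied by the caller, and the initial centers are passed in as an argument where the thesis would obtain them from the entropy-based ant colony stage. The function names are hypothetical, and the Voronoi-distance adjustment of point (3) is not implemented here.

```python
def weighted_sq_dist(x, v, w):
    """Squared Euclidean distance with one (assumed fixed) weight per attribute."""
    return sum(wk * (xk - vk) ** 2 for xk, vk, wk in zip(x, v, w))


def update_memberships(data, centers, w, m=2.0):
    """Standard FCM membership update; returns one row per data object."""
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for x in data:
        d2 = [weighted_sq_dist(x, v, w) for v in centers]
        if any(d == 0.0 for d in d2):
            # The object coincides with one or more centers:
            # share full membership among those centers only.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            total = sum(hits)
            memberships.append([h / total for h in hits])
        else:
            inv = [(1.0 / d) ** exponent for d in d2]
            total = sum(inv)
            memberships.append([t / total for t in inv])
    return memberships


def update_centers(data, memberships, old_centers, m=2.0):
    """Standard FCM center update: mean of the data weighted by u^m."""
    dim = len(data[0])
    centers = []
    for i, old in enumerate(old_centers):
        um = [row[i] ** m for row in memberships]
        total = sum(um)
        if total == 0.0:
            centers.append(list(old))  # no object supports this center; keep it
            continue
        centers.append(
            [sum(u * x[k] for u, x in zip(um, data)) / total for k in range(dim)]
        )
    return centers


def fcm(data, init_centers, w, m=2.0, tol=1e-6, max_iter=100):
    """Iterate FCM from the given initial centers until the centers stop moving."""
    centers = [list(v) for v in init_centers]
    for _ in range(max_iter):
        memberships = update_memberships(data, centers, w, m)
        new_centers = update_centers(data, memberships, centers, m)
        shift = max(
            abs(a - b)
            for new, old in zip(new_centers, centers)
            for a, b in zip(new, old)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers, update_memberships(data, centers, w, m)


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
    # Initial centers are hand-picked here; in the thesis they come from the ant colony stage.
    final_centers, final_u = fcm(points, [(1.0, 1.0), (8.0, 8.0)], w=[1.0, 1.0])
    print(final_centers)
    print(final_u)
```

Because the number of clusters is taken from the length of `init_centers`, the sketch mirrors the two-stage structure described above: whatever stage produces the initial centers also fixes C, and fuzzy C-means only refines them.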
Keywords/Search Tags:Clustering algorithm, Fuzzy C-means algorithm, Information entropy, Ant colony algorithm, Voronoi distance