Cluster analysis is an effective data mining tool that has attracted wide attention in recent decades and has been widely applied and studied in data mining, image segmentation, pattern recognition, information retrieval, computer vision, and other fields. The K-means algorithm is a traditional partitional clustering method. It is widely used in data mining to cluster large data sets because of its high efficiency. As data mining technology has matured, a variety of intelligent optimization algorithms have been applied successfully to K-means clustering. The shuffled frog leaping algorithm (SFLA), a new member of the family of swarm intelligence algorithms, has produced good results in many fields and has become an active frontier topic in artificial intelligence. To address the K-means algorithm's dependence on its initial state and its tendency to converge to local optima, this thesis proposes a K-means clustering algorithm based on a modified shuffled frog leaping algorithm (MSFLA). The main contributions are as follows:
1. The concept, process, similarity measures, criterion functions, and classification of clustering algorithms are introduced, and the shortcomings of the K-means algorithm are analyzed. The basic framework, working principles, and features of the SFLA are introduced, and its advantages and disadvantages are analyzed.
2. A modified shuffled frog leaping algorithm is proposed. An inertia weight coefficient is introduced to adjust the moving distance, so that the portion of the moving distance carried over from the previous update declines linearly over the iterations, which improves the algorithm's optimization performance.
3. The modified shuffled frog leaping algorithm is applied to the K-means algorithm, yielding the K-means algorithm based on the MSFLA. It combines the benefits of the modified shuffled frog leaping algorithm and the K-means algorithm.
Consequently, this method overcomes the sensitivity to initialization and the tendency to fall into local extrema that characterize the traditional K-means clustering algorithm, and the performance of the algorithm is also improved. Tests on the Iris, Zoo, Crude oil, and Thyroid disease data sets, together with comparisons between the K-means algorithm based on the MSFLA and other heuristic algorithms for K-means clustering, show that the proposed algorithm is effective.
4. The K-means algorithm based on the modified shuffled frog leaping algorithm is applied to the classification of voltage control areas (VCA) and of the level of informatization in the western region. The results show that the algorithm has good prospects in cluster analysis.
In conclusion, this thesis summarizes the work and gives an outlook on the prospects of the K-means algorithm based on the MSFLA.
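
The inertia-weighted update described in the second contribution can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the form below is an assumption: the new step is taken to be the previous step scaled by a weight that declines linearly, plus the usual SFLA move of the worst frog toward the best frog. The function names, the weight bounds `w_max` and `w_min`, and the step limit `d_max` are hypothetical and chosen only for illustration.

```python
import random


def inertia_weight(iteration, max_iterations, w_max=0.9, w_min=0.4):
    """Linearly decrease the weight from w_max to w_min over the run.

    The bounds 0.9 and 0.4 are illustrative assumptions, not values
    taken from the thesis.
    """
    if max_iterations <= 1:
        return w_min
    frac = iteration / (max_iterations - 1)
    return w_max - (w_max - w_min) * frac


def update_worst_frog(worst, best, prev_step, weight, d_max, rng=random):
    """Return (new_position, new_step) for the worst frog of a memeplex.

    Assumed update rule, applied per dimension:
        step = weight * previous_step + r * (best - worst)
    where r is one uniform random number in [0, 1) shared by all
    dimensions, and each step component is clamped to [-d_max, d_max].
    """
    r = rng.random()
    new_step = []
    for x_w, x_b, d_prev in zip(worst, best, prev_step):
        d = weight * d_prev + r * (x_b - x_w)
        d = max(-d_max, min(d_max, d))  # keep the move within the step limit
        new_step.append(d)
    new_position = [x_w + d for x_w, d in zip(worst, new_step)]
    return new_position, new_step
```

In a K-means setting, each frog would typically encode a candidate set of cluster centers as one flat vector, and the fitness would be the clustering criterion (for example, the within-cluster sum of squared distances). That encoding is likewise an assumption here and is not specified in the abstract.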