With the advent of the big data era, the demand for extracting valuable information from huge amounts of data grows daily, and new methods for handling such data are urgently needed. Cluster analysis is an important part of data mining and matters greatly to the development of data mining technology. It can be applied on its own to a data set to reveal how the data are distributed, and it can also serve as a preprocessing step for other data mining methods. Because traditional methods handle these problems inefficiently, and in order to normalize and process huge amounts of data better, uncover the valuable information hidden in them, and meet the requirements of practical applications more comprehensively and efficiently, thorough research into the relevant clustering methods is urgently needed.

The K-means clustering algorithm is a classical clustering method that is conceptually simple, easy to implement, and quick to converge. Its main drawback is that the number of clusters and the initial cluster centers must be specified when the method is initialized. Swarm intelligence algorithms are optimization and search algorithms that simulate the behavior of biological populations; the genetic algorithm and the ant algorithm are representative examples. The genetic algorithm searches the whole solution space and produces each new generation through genetic operations, so it can widen the search scope, increase the diversity of solutions, and help avoid convergence to a local optimum.
The ant algorithm is highly adaptable, can handle multiple types of data, can find the optimal solution, and can be combined with other intelligent algorithms or clustering algorithms to form new, efficient hybrid algorithms.

This paper studies clustering algorithms and intelligent algorithms. It first introduces cluster analysis, examining in detail the current requirements for clustering, the criteria for evaluating clustering results, and several classical clustering methods. It then introduces the concept of swarm intelligence, analyzing the principles of the genetic algorithm and the ant algorithm, their respective advantages and disadvantages, and their applications in clustering. The ant clustering algorithm converges slowly in its early stage and is prone to premature convergence in its later stage. Because the K-means clustering algorithm converges rapidly and the ant algorithm can obtain the optimal solution, the two are combined into a hybrid algorithm. This combination, however, does not remove the ant clustering algorithm's tendency to converge prematurely in the later stage. The study finds that applying the mutation operator of the genetic algorithm in the later iterations enlarges the range of candidate solutions, so the mutation operator is introduced into the hybrid algorithm. Comparative experiments on UCI data sets against the ant clustering algorithm and the original K-means ant clustering algorithm show that the improved algorithm effectively mitigates convergence to local optima while retaining the original algorithm's fast convergence.
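
To illustrate how a mutation operator can widen the set of candidate solutions, the following minimal Python sketch mutates a cluster assignment. It assumes a solution is encoded as a list of cluster labels, one per data point; that encoding, the function name `mutate_assignment`, and the parameters `rate` and `rng` are hypothetical choices made for this example and are not taken from the algorithm studied here.

```python
import random


def mutate_assignment(labels, k, rate, rng):
    """Return a copy of `labels` in which each point, with probability `rate`,
    is moved to a different cluster chosen uniformly at random.

    Assumption: a candidate solution is a list of cluster labels in range(k).
    """
    mutated = list(labels)
    if k < 2:
        return mutated  # with a single cluster there is nowhere to move a point
    for i, old in enumerate(mutated):
        if rng.random() < rate:
            # Draw from the k - 1 other clusters so a mutation always changes the label.
            new = rng.randrange(k - 1)
            mutated[i] = new if new < old else new + 1
    return mutated


rng = random.Random(0)  # fixed seed so the example is repeatable
current = [0, 1, 2, 0, 1, 2]
candidate = mutate_assignment(current, k=3, rate=0.2, rng=rng)
```

In a hybrid clustering loop, such an operator would typically be applied only in later iterations, once the best solution has stopped improving, so that the search can leave a local optimum without disturbing the fast early convergence.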
In addition, because K-means requires a preset number of clusters and initializes the cluster centers randomly, its clustering results fluctuate widely. To address this, this paper combines the divisive and agglomerative approaches of hierarchical clustering and proposes a hierarchical K-means clustering method based on the levels of a minimum spanning tree. Simulation experiments demonstrate the effectiveness of the algorithm.
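
The sensitivity of K-means to its initial centers, and the way a minimum spanning tree can supply more stable ones, can be sketched in a few lines of Python. This is not the hierarchical method proposed here, whose details are not given in this abstract. As a simplifying assumption, the sketch builds the tree with Prim's algorithm, cuts its k-1 longest edges, and uses the centroids of the resulting components as starting centers. All function names (`kmeans`, `sse`, `mst_edges`, `mst_seed_centers`) are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centers, max_iter=100):
    """Lloyd's K-means; `centers` fixes both the cluster number and the starting position."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


def mst_edges(points):
    """Prim's algorithm on the complete graph; edges are (squared length, u, v)."""
    n = len(points)
    in_tree = [False] * n
    best = [(float("inf"), -1)] * n  # cheapest known link into the tree, per node
    best[0] = (0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v] and sq_dist(points[u], points[v]) < best[v][0]:
                best[v] = (sq_dist(points[u], points[v]), u)
    return edges


def mst_seed_centers(points, k):
    """Cut the k-1 longest tree edges and return the centroid of each component."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for _, u, v in kept:
        parent[find(u)] = find(v)
    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return sorted(tuple(sum(x) / len(g) for x in zip(*g)) for g in groups.values())


points = [(0, 0), (0, 1), (4, 0), (4, 1)]
poor = kmeans(points, [(2, 0), (2, 1)])               # an unlucky initialization
seeded = kmeans(points, mst_seed_centers(points, 2))  # tree-based initialization
print(sse(points, *poor), sse(points, *seeded))
```

On this four-point example the unlucky initialization stalls in a local optimum with a sum of squared errors of 16, while the tree-seeded run separates the left and right pairs and reaches 1.0.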