Font Size: a A A

Analysis And Research Of K-means Algorithm Based On Genetic Algorithm

Posted on:2010-08-12Degree:MasterType:Thesis
Country:ChinaCandidate:X J SunFull Text:PDF
GTID:2178360275463023Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the development of the information technology in recent years, the economical and social value of information resource is more and more important. By data mining, the valuable and interesting information or knowledge is discovered from numerous data in order to support the science decision. Clustering analysis is a basic assignment of the data mining and an unsupervised classifying method. The goal of clustering is to partition data without any category tags set into such clusters that objects within a cluster have high similarity, but objects between clusters are very dissimilar. Clustering has been used in various ways in commerce, web classification, and image processing and so on. So far, clustering algorithms include the partitioning, hierarchical, model-based, grid-based, density-based algorithm.K-means algorithm is one of the major algorithms of clustering analysis and a kind of clustering algorithm based on partitioning. The algorithm randomly chooses k points as the initial clustering centers; achieve the process of clustering through iteration. It has its own inherent deficiency: it always cannot get the global optimum value because of running into a local optimum easily. The number of k needs to be known before clustering, the task is so difficult for the no experienced users. The choice of the initial center has a great impact on the clustering result. Sometimes K-means algorithm is sensitive to the isolated points and noise.Genetic algorithm is an algorithm of searching the optimal solution by simulating the natural evolution process. It designs a series of process including gene combination, crossover, variation, natural selection to optimize the solution. In these procedures, the bad gene is eliminated through the principle of"survival of the fittest"and the solution develops towards the better direction. Genetic algorithm begins with a group of initial feasible solutions, achieves the global effective searching of the feasible field with the objective function and converges to the global optimal value with the probability 1. It has a distinctive feature of implicit parallelism and effective utilization capacity of the overall information. The kind of nicer characteristic makes the genetic algorithm a useful tool for function and combination optimization. So the effective integration of genetic algorithm and K-means algorithm fully bring into play the global optimized ability of the genetic algorithm and the local searching capability of K-means algorithm in order to improve the clustering quality effectively.Aiming the defects existing in the traditional genetic and clustering algorithms, an improved genetic k-means algorithm is put forward. The algorithm has the following improvements: The self-identify crossover and mutation operators are adopted. The crossover operator speeds up the convergence rate by giving fine pattern heredity of the father generation to next generation. Adaptive mutation operator can expand searching scope and enhance the capability of the algorithm to avoid approaching the best local solution.; optimizing the initial centers of K-means algorithm in order to avoid the random selection; dynamically fixing the value of k according to the fitness function; using a K-means clustering method based on weights to calculate clustering centers in order to reduce the sensitivity of K-means algorithm to isolated points and noise. The experiment uses standard database to test the effectiveness of the improved algorithm, and designs three sets of experimental program to test the improved algorithm and other algorithms, and then explains the experimental results in form of charts and tables, thus draws a conclusion that the improved algorithm is better than other algorithms.
Keywords/Search Tags:Data Mining, K-means Clustering Algorithm, Genetic Algorithm
PDF Full Text Request
Related items