Research And Application Of K-means Algorithm In Data Mining Technology Based On Genetic Algorithm

Posted on:2015-05-19

Degree:Master

Type:Thesis

Country:China

Candidate:S Zhao

Full Text:PDF

GTID:2298330467951350

Subject:Control theory and control engineering

Abstract/Summary:

With the rapid development of information technology and database technology, the data generated in people's daily lives is growing explosively. Data mining, a technology for extracting useful information from large data sets, helps people make scientific decisions based on data. Cluster analysis is a fundamental data mining task and an unsupervised classification method. Through cluster analysis, large amounts of data can be divided into clusters without any prior knowledge, so that objects within the same cluster are very similar while objects in different clusters are dissimilar. Interesting patterns can then be discovered in the data.

Clustering analysis is a common method in data mining, and the K-means algorithm is the most popular partition-based clustering algorithm. Its disadvantage is that it is easily influenced by the initial cluster centers and tends to converge prematurely to a locally optimal solution. To solve this problem, we exploit the global optimization ability of the genetic algorithm and propose two algorithms: a K-means clustering algorithm based on an adaptive genetic algorithm and a K-means clustering algorithm based on a DNA genetic algorithm. The validity of the proposed algorithms is verified on sample data. We have also applied the improved algorithm to customer behavior segmentation at China Mobile and achieved good results.

The main work is as follows:

1) Combining the advantages of the genetic algorithm and the K-means clustering algorithm, this paper designs a K-means clustering algorithm based on an adaptive genetic algorithm. The improved algorithm uses the global optimization ability of the genetic algorithm to obtain good initial centers, and then runs K-means clustering from those centers to obtain the final clustering result.

2) To address the sensitivity of K-means clustering to the initial cluster centers, this paper proposes a K-means clustering algorithm based on a DNA genetic algorithm. It adopts DNA coding, and two crossover operators are designed. These improve the diversity of the population, which effectively avoids premature convergence. This paper also proposes a new multi-step evolution strategy that enhances the global search capability of the algorithm. Finally, the Rosenbrock test function is used to verify the validity of the algorithm, and sample data are then used to verify the accuracy of the clustering results.
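
The two-stage idea in item 1), a genetic algorithm that searches for initial centers followed by ordinary K-means, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the encoding, the operators, or the adaptive rule, so everything here is an assumption made for illustration. Each chromosome is a list of k data points, the cost is the within-cluster sum of squared errors, and the mutation rate is simply lowered for above-average parents. The function names and parameters (`sse`, `kmeans`, `ga_initial_centers`, `pop_size`, `generations`) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)

def kmeans(points, centers, iters=50):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for pt in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(pt, centers[i]))
            groups[j].append(pt)
        # An empty cluster keeps its previous center.
        new = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
               for i, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

def ga_initial_centers(points, k, pop_size=20, generations=30, seed=0):
    """Search for good starting centers with a small genetic algorithm.

    Assumed design (not from the thesis): tournament selection, one-point
    crossover on the list of centers, elitism, and a mutation rate that
    is low for better-than-average parents and high otherwise.
    """
    rng = random.Random(seed)
    pop = [rng.sample(points, k) for _ in range(pop_size)]
    best = min(pop, key=lambda ch: sse(points, ch))
    for _ in range(generations):
        costs = [sse(points, ch) for ch in pop]
        avg = sum(costs) / len(costs)
        nxt = [best]  # elitism: the best chromosome always survives
        while len(nxt) < pop_size:
            a = min(rng.sample(range(pop_size), 2), key=lambda i: costs[i])
            b = min(rng.sample(range(pop_size), 2), key=lambda i: costs[i])
            cut = rng.randrange(1, k) if k > 1 else 0
            child = pop[a][:cut] + pop[b][cut:]
            # Adaptive mutation: disturb good parents less than poor ones.
            pm = 0.05 if min(costs[a], costs[b]) <= avg else 0.3
            child = [rng.choice(points) if rng.random() < pm else c
                     for c in child]
            nxt.append(child)
        pop = nxt
        best = min(pop, key=lambda ch: sse(points, ch))
    return best

if __name__ == "__main__":
    rng = random.Random(1)
    data = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(20)]
    start = ga_initial_centers(data, 3)
    final = kmeans(data, start)
    print("SSE of GA centers:", sse(data, start))
    print("SSE after K-means:", sse(data, final))
```

Because the genetic stage only supplies starting points, the K-means stage can never end with a higher error than the centers it was given. The benefit of the combination, under these assumptions, is that the starting points are chosen by a population-based search rather than a single random draw.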

Keywords/Search Tags:

Data mining, clustering analysis, genetic algorithms, K-means

Related items

1	Research On Modern Optimization Algorithm Based K-Means Clustering And Its Applications On Student Grade Mining
2	Teaching Analysis Based On The Clustering Rule Mining System Design And Implementation
3	Data Mining Algorithms And Applications
4	Research Of K-Means Clustering In Data Mining Based On Genetic Algorithm
5	Clustering Analysis Of K-means Based On Improved Genetic Algorithm
6	Study Of K-Means Clustering Based On Genetic Algorithm
7	Study Of K-means Clustering Based On Genetic Algorithm
8	Research On Improved Clustering Algorithms And Its Application In The Analysis Of Achievement
9	Study And Application Of CRM Data Mining Based On Clustering Algorithms
10	Research On Parallel Optimization Of Clustering Algorithms In Data Mining