Research Of K-Means Clustering In Data Mining Based On Genetic Algorithm

Posted on:2010-07-21

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Zhao

Full Text:PDF

GTID:2178360275462180

Subject:Control theory and control engineering

Abstract/Summary:

Data mining is a young discipline that has emerged with the development of information technology, and it is an active research topic in information and database technology. Its purpose is to discover hidden, useful knowledge in huge amounts of data in order to support scientific decision-making.

Cluster analysis is one of the important themes in data mining. Clustering is an unsupervised classification method: without any prior knowledge, it partitions a data set into clusters such that objects within a cluster are highly similar to one another but very dissimilar to objects in other clusters. As a classical clustering method, k-means has been widely used in commerce, market analysis, biology, text classification, and other fields. However, k-means has two severe defects: it is sensitive to its initial values, and it easily becomes trapped in a local optimum. To address these defects, the paper draws on the idea of the genetic algorithm, proposes a hybrid clustering algorithm based on the genetic algorithm and k-means, and tests the performance of the hybrid algorithm.

The main research work of the paper includes:

First, clustering analysis technology is introduced in detail, most existing clustering algorithms are classified, and their advantages and disadvantages are analyzed. On this basis, the k-means method is chosen as the research target.

Second, the genetic algorithm, an important method in data mining, is introduced, and its characteristics, basic elements, and application flow are described in detail.

Third, based on the characteristics of the genetic algorithm and the k-means method, a new k-means clustering method based on an improved genetic algorithm is proposed. The proposed algorithm is described in detail in terms of its coding method, fitness function, selection operator, crossover operator, mutation operator, k-means operator, and other aspects.

Finally, to test the performance of the proposed algorithm, the paper gives three simulation experiments. The simulation results show that, compared with the k-means method, the proposed algorithm obtains a better clustering result.
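The abstract names the components of the hybrid algorithm (coding, fitness, selection, crossover, mutation, and a k-means operator) but does not specify them. The following Python sketch shows one common way such components can fit together. It is not the thesis's algorithm. Every concrete choice here is an assumption made for illustration: chromosomes encode the k cluster centers directly, fitness is the inverse of the within-cluster sum of squared errors, selection is roulette-wheel, crossover is single-point, mutation replaces a center with a random data point, the k-means operator is a single k-means iteration, and the best individual is carried over (elitism). All function and parameter names are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centers):
    """Within-cluster sum of squared errors (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans_step(points, centers):
    """One k-means iteration: assign points, then move centers to cluster means."""
    labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
              for p in points]
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new_centers.append(old)  # assumption: an empty cluster keeps its center
    return new_centers

def genetic_kmeans(points, k, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Coding (assumed): a chromosome is a list of k centers drawn from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    best = min(population, key=lambda c: sse(points, c))
    for _ in range(generations):
        # Fitness (assumed): inverse SSE, so tighter clusterings score higher.
        fitness = [1.0 / (1.0 + sse(points, c)) for c in population]
        # Selection (assumed): roulette wheel, proportional to fitness.
        parents = rng.choices(population, weights=fitness, k=pop_size)
        children = []
        for i in range(0, pop_size, 2):
            a, b = list(parents[i]), list(parents[(i + 1) % pop_size])
            if k > 1:
                # Crossover (assumed): swap the tails after a random cut point.
                cut = rng.randrange(1, k)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children += [a, b]
        children = children[:pop_size]
        # Mutation (assumed): replace a center with a random data point.
        for child in children:
            for j in range(k):
                if rng.random() < mutation_rate:
                    child[j] = rng.choice(points)
        # K-means operator (assumed): refine every child with one k-means step.
        population = [kmeans_step(points, c) for c in children]
        # Elitism (assumed): never lose the best solution found so far.
        candidate = min(population, key=lambda c: sse(points, c))
        if sse(points, candidate) < sse(points, best):
            best = candidate
        else:
            population[0] = best
    return best, sse(points, best)

# Usage on a toy data set with two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
centers, score = genetic_kmeans(points, k=2)
print(centers, score)
```

In a design like this, the genetic operators explore different center configurations, which is intended to reduce the dependence on a single starting point, while the k-means step supplies fast local refinement. Whether the thesis uses these particular operators cannot be determined from the abstract.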

Keywords/Search Tags:

Data mining, Cluster analysis, Genetic algorithm, k-means, IGKA

Related items

1	Optimized K-Means Clustering Analysis Based On Genetic Algorithm
2	Research On Parallel K-means Algorithm Based On Genetic Algorithm
3	Data Mining Technology And Its Application In The Supermarket In Crm
4	Research In Data Mining Method Based On Genetic Algorithms
5	Research On Cluster Analysis Based On Optimized Genetic Algorithm
6	The Research Of Clustering Technologies In Data Mining
7	Research And Application Of K-means Algorithm In Data Mining Technology Based On Genetic Algorithm
8	Research In Data Mining Based On Genetic Algorithms
9	Research On K-Means Algorithm And Its Integration With Intelligent Algorithms
10	Research And Application Of K-means Clustering Algorithm