Font Size: a A A

Clustering Algorithm Based On Rough Sets And Genetic Algorithm

Posted on:2011-11-07Degree:MasterType:Thesis
Country:ChinaCandidate:Y ZhangFull Text:PDF
GTID:2208360308467717Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the development of information technology, Data mining has become an active area of studying. Clustering is an important field of data mining not only for its theoretical value, but also for its wide applications. K-means clustering is the most popular clustering algorithm in cluster analysis. However its vital shortcomings are its sensibility to initial seeds and its easy falling into local solutions.In order to deal with the problems of K-means clustering algorithms, we did the following works in this thesis.Firstly, we proposed a novel rough K-means clustering algorithm based on the weight of density. This algorithm defined a new density function for each sample according to the denseness of samples around it, and selected the top K samples with higher density as initial centers for rough K-means,. which modified the method to select initial seeds randomly of available rough K-means algorithms. Further more this algorithm defined the new weight for each sample according its value of the novel density function, so that the better centroid of each cluster can be calculated. Experiments result proved that the algorithm we proposed has a better clustering result compared to the available rough K-means and rough K-means clustering algorithm with weight based on density proposed by Lingras et al and Zheng et al respectively, and does not influenced by noisy data.Secondly, we presented a new operator called splitting operators, and introduced the new operator into the simple genetic algorithms and the adaptive genetic algorithms to solve the primary problems in Genetic Algorithms, such as premature and the slowing convergence speed. These novel algorithms are compared with simple genetic algorithm and the adaptive genetic algorithm respectively by the testing on several functions. The results show that the new algorithms with splitting operators can not only converge to the global optimal solutions, but also make a progress in the convergence speed.Thirdly, we proposed an improved gradient operator, and introduced this new gradient operator to niche genetic algorithm and adaptive niche genetic algorithm respectively to avoid the premature of niche genetic algorithms and improve their convergence speed. In the process of these algorithms we used the dynamical crossover operators and the dynamical mutation operators. The test results on Shubert function show that the new algorithms cannot only improve the convergence speed, but also find all of the global optimal solutions.Finally, we proposed two improved K-Means algorithms based on niche genetic algorithms. The two K-Means clustering algorithms combine the local search capability of K-Means and the global optimization capability of niche genetic algorithms, such that they can overcome the sensitivity of K-means algorithm to initial seeds and its easily converging on local solutions. The most important thing of the two improved K-means algorithms is that they transformed the K-means clustering problem to an optimal problem with K optimal solutions, and solved it with niche genetic algorithms. Experiments on UCI data sets and on synthetic data set demonstrate that the algorithms we proposed can get better clustering results, and does not be influenced by noisy data.
Keywords/Search Tags:rough set, clustering, genetic algorithm, adaptive, niche
PDF Full Text Request
Related items