Font Size: a A A

Research On Evolutionary K-Means Algorithm

Posted on:2014-12-12Degree:MasterType:Thesis
Country:ChinaCandidate:L Z WangFull Text:PDF
GTID:2308330461473375Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Clustering is an important task of data mining, which is widely used in pattern recognition, machine learning, image processing, and other fields. It can be seen as a combinatorial optimization problem, many scholars have applied the evolutionary algorithms, which has the ability of global optimization, for clustering research. K-Means is a classic clustering algorithm, but it needs to specify the number of clusters k, and is sensitive to the initial cluster centers. In order to overcome the impact of the initial cluster centers for K-Means clustering results, scholars proposed genetic K-Means clustering algorithm called GKA, which used the evolutionary idea to optimize the cluster centers.However, GKA is only for clustering problems with the fixed k. In order to determine k adaptively, Hruschka designed a novel evolutionary K-Means clustering algorithm called EAC (Evolutionary Algorithm of Clustering), which abandoned the complex crossover operator, used two special mutation operators to evolve the cluster centers and numbers, and took K-Means as a local search operator. But EAC had no guidance in the choice of mutation objects and operators, which are randomly selected. Alves and Naldi chose clusters for mutation through the quality of them, and chose mutation operators through their performances, so F-EAC (FastEAC) algorithm was presented, whose effectiveness was analyzed.In the framework of F-EAC, this thesis has studied the mutation operators deeply from different angles to make improvements, and researched the parallel algorithm of F-EAC on a multi-core machine. I have mainly completed the following four aspects:First, some key elements of evolutionary clustering algorithm are introduced and analyzed, such as encoding schema, evolutionary operator, fitness function, local search etc. Second, according to the randomness and the locality of splitting process, I have presented two kinds of more effective global splitting operators, which use the idea of max-min distance and the information of peripheral clusters to guide the selection of the initial splitting centers, in order to make splitting process more beneficial to global partition, enhancing the precision of evolutionary clustering. Third, due to the lack of completeness and appropriate analysis of evolutionary operators, I have studied the status of mutating cluster with different characters, and presented a merging operator, which is adaptively applied to the corresponding mutation clusters, achieving a more effective clustering. Fourth, according to the computational complexity of evolutionary operations and K-Means, and the character that evolutionary algorithm is apt to parallel, I have proposed a parallel F-EAC algorithm, which is based on multi-core processor and the programming interface of OpenMP, reducing the running time greatly.
Keywords/Search Tags:K-Means, evolutionary algorithm, global splitting, adaptive merging, parallel algorithm
PDF Full Text Request
Related items