
Analysis And Design Of K-Means Algorithms Based On Swarm Intelligence

Posted on: 2024-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: S L Zhong
Full Text: PDF
GTID: 2568307091988239
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the big data era, how to process and analyze massive, complex, and diverse data sets, and how to extract the information and value they contain, has become an important basis for decision-making and innovation. Cluster analysis is one of the important means of processing and analyzing such data. Although many clustering algorithms exist, several of them show limitations when confronted with diverse tasks. This thesis investigates three key issues: first, certain existing K-Means algorithms are sensitive to the initial centroid selection and lack global search capability; second, as data volume and dimensionality increase, the time overhead of some particle swarm-based optimization algorithms and genetic clustering algorithms becomes unacceptable; third, most K-Means type methods tend to perform poorly on imbalanced data. To address these challenges, this thesis makes the following contributions:

(1) A clustering algorithm based on particle swarm optimization with empty cluster reassignment for Ball K-Means, called "PSO Ball K-Means", is proposed. The algorithm combines the random search strategy of Particle Swarm Optimization (PSO) with the Ball K-Means method, addressing the sensitivity to initial centroids and the lack of global search capability in the Ball K-Means algorithm. It also proposes a strategy for reassigning empty clusters, which fixes the problem of empty clusters arising during the random search process. Experimental results show that the clustering results of the algorithm are not affected by the initial cluster centroids, and that the algorithm obtains the best clustering accuracy on multiple datasets, especially as the size and dimensionality of the data set increase. Ablation experiments show that the empty cluster reassignment strategy can improve the quality of the clustering results.

(2) A Ball K-Means clustering algorithm based on exploratory vectors, named "Ball XK-Means", is proposed. Ball XK-Means improves the global search ability of the algorithm by adding exploratory disturbance vectors to the cluster centroids. The algorithm also uses the "ball cluster" idea, which reduces the number of distance calculations between data points and centroids and improves efficiency. Experimental results show that this algorithm produces more stable and accurate clustering results than the Ball K-Means and K-Means algorithms, and has lower computational cost and higher efficiency than the K-Means and XK-Means algorithms, especially on large-scale, high-dimensional data.

(3) A fast self-adaptive multi-prototype clustering algorithm, called FSMC, is proposed for handling imbalanced data, based on the ball cluster idea. FSMC enhances the learning rate of the prototypes and reduces computational cost by combining the mean update strategy with the "ball cluster" idea. In addition, the algorithm defines the stability of cluster structures and introduces a new cluster splitting strategy based on stable and active areas to counteract the uniform effect of the K-Means algorithm on imbalanced data. Experimental results demonstrate that FSMC performs well on the accuracy, NMI, and DCV metrics and accurately identifies the true number of clusters. Furthermore, compared to the SMCL algorithm, FSMC has lower computational cost and CPU time overhead, and this advantage becomes more prominent as data size or dimensionality increases.
Keywords/Search Tags:Cluster analysis, Swarm intelligence, K-Means type, Imbalanced clustering, Multi-prototype, Self-Adaptive clustering
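
The empty-cluster problem mentioned in contribution (1) can be illustrated with a small sketch. The abstract does not state the thesis's actual reassignment rule, so the rule used here (move an empty cluster's centroid onto the point that lies farthest from its own centroid) is an assumption chosen only for illustration, and all function names are hypothetical. Plain K-Means assignment stands in for the Ball K-Means and PSO components, which are not reproduced.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
        for p in points
    ]


def reassign_empty_clusters(points, centroids, labels):
    """Give every empty cluster one member.

    ASSUMED rule (not taken from the thesis): an empty cluster's centroid
    is moved onto the point farthest from its current centroid, taken
    only from clusters that have more than one member.
    """
    centroids = [list(c) for c in centroids]
    labels = list(labels)
    counts = [0] * len(centroids)
    for label in labels:
        counts[label] += 1
    for k in range(len(centroids)):
        if counts[k] > 0:
            continue
        candidates = [i for i in range(len(points)) if counts[labels[i]] > 1]
        if not candidates:
            break  # fewer points than clusters; nothing left to move
        far = max(candidates, key=lambda i: sq_dist(points[i], centroids[labels[i]]))
        counts[labels[far]] -= 1
        centroids[k] = list(points[far])
        labels[far] = k
        counts[k] = 1
    return centroids, labels


def kmeans_step(points, centroids):
    """One K-Means iteration: assign, repair empty clusters, update means."""
    labels = assign(points, centroids)
    centroids, labels = reassign_empty_clusters(points, centroids, labels)
    dim = len(points[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += p[d]
    new_centroids = [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]
    return new_centroids, labels
```

Calling `kmeans_step` with a centroid placed far from all the data, as a random search step might produce, returns labels in which every cluster index still appears at least once, provided there are at least as many points as clusters.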