
Research On Clustering Method Based On Multi-objective Genetic Algorithm

Posted on: 2021-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Yin
Full Text: PDF
GTID: 2428330614958391
Subject: Computer Science and Technology
Abstract/Summary:
As an unsupervised approach in machine learning, clustering is an important way to learn structural information from data. Prototype-based clustering is one of the most widely used families of clustering algorithms, with applications in image segmentation, text analysis, gene analysis, and social network analysis. However, it has two defects. First, it requires the number of clusters to be known in advance, yet prior knowledge rarely allows this number to be specified. Second, it is sensitive to the initial cluster centers, which makes the clustering results unstable. Introducing a genetic algorithm into clustering has great potential to solve these problems. This thesis studies clustering based on multi-objective genetic algorithms and discusses the advantages of the multi-objective approach over the single-objective approach. It then proposes a new clustering method, NSGAII-GR (Non-dominated Sorting Genetic Algorithm-II using Gene Rearrangement). NSGAII-GR is guided by its objective functions: an evolutionary algorithm generates and screens individuals that represent clusterings, so that they perform as well as possible on those objectives. The method adaptively determines the number of clusters and obtains stable, good clustering results. The main work of this thesis is as follows:

1. The framework of a clustering model based on a multi-objective genetic algorithm is studied. Using this framework, a prototype-based clustering method driven by a multi-objective genetic algorithm is implemented. It does not need the number of clusters to be specified in advance and still obtains good clustering results. Compared with the single-objective method, the multi-objective method performs better.

2. The selection of objective functions in the algorithm is discussed. The sum of generalized sample variance is used as one objective function, and the Calinski-Harabasz index is used as the other. This keeps the number of clusters in the solution set within a more reasonable range, which makes it easier for the algorithm to find the correct number of clusters. A second-stage selection operator is then designed to determine the optimal number of clusters without using any prior information, and three recommended solutions are selected as output from the solution set, in which the number of clusters is not fixed.

3. The clustering method described above is improved. A gene rearrangement technique combined with cluster merging is proposed to speed up convergence and achieve better clustering results. NSGAII-GR, the improved clustering method based on a multi-objective genetic algorithm, adaptively determines the number of clusters on both artificial and real datasets without using any prior information, and it obtains good and stable clustering results.
Keywords/Search Tags: Clustering, Genetic algorithm, Multi-objective optimization, Generalized sample variance, Gene rearrangement
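
The two objectives named in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation. The function names are hypothetical. The "generalized sample variance" of a cluster is assumed here to be the determinant of its sample covariance matrix, which may differ from the definition used in the thesis. The brute-force non-dominated filter only stands in for the non-dominated sorting step of NSGA-II; gene rearrangement, cluster merging, and the second-stage selection operator are not modeled.

```python
import numpy as np


def generalized_variance_sum(X, labels):
    """Sum over clusters of det(sample covariance matrix).

    Assumption: this is one common reading of "sum of generalized sample
    variance"; the thesis may define the quantity differently.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in np.unique(labels):
        pts = X[labels == k]
        if len(pts) < 2:
            continue  # a singleton has no sample covariance; it adds nothing
        cov = np.atleast_2d(np.cov(pts, rowvar=False))
        total += float(np.linalg.det(cov))
    return total


def calinski_harabasz(X, labels):
    """Calinski-Harabasz index: between-cluster dispersion over
    within-cluster dispersion, each scaled by its degrees of freedom."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = len(X)
    clusters = np.unique(labels)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index needs 2 <= k < n clusters")
    overall_mean = X.mean(axis=0)
    between = 0.0
    within = 0.0
    for c in clusters:
        pts = X[labels == c]
        center = pts.mean(axis=0)
        between += len(pts) * np.sum((center - overall_mean) ** 2)
        within += np.sum((pts - center) ** 2)
    return float((between / (k - 1)) / (within / (n - k)))


def non_dominated(scores):
    """Indices of score tuples that no other tuple dominates.

    Every objective is minimized, so a maximized objective such as the
    Calinski-Harabasz index should be passed in negated.
    """
    keep = []
    for i, a in enumerate(scores):
        dominated = False
        for j, b in enumerate(scores):
            if j == i:
                continue
            no_worse = all(bv <= av for av, bv in zip(a, b))
            better = any(bv < av for av, bv in zip(a, b))
            if no_worse and better:
                dominated = True
                break
        if not dominated:
            keep.append(i)
    return keep
```

Under these assumptions, each candidate clustering would be scored as `(generalized_variance_sum(X, labels), -calinski_harabasz(X, labels))`, and `non_dominated` would return the candidates that form the current Pareto front.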