
Research On Improved Density Peak Clustering Algorithm

Posted on: 2020-04-21    Degree: Master    Type: Thesis
Country: China    Candidate: L N Zhang    Full Text: PDF
GTID: 2428330602452133    Subject: Engineering
Abstract/Summary:
Clustering by fast search and find of density peaks is a novel clustering algorithm that identifies cluster centers and assigns data points efficiently and effectively with few control parameters, regardless of the shape and dimensionality of the dataset. Owing to these advantages, the density peak clustering algorithm offers a new solution to many practical problems, and as it has been applied in more fields it has become a research hotspot in clustering. However, it still has shortcomings: it cannot identify cluster centers automatically without expert intervention, and it cannot handle datasets with large density differences between clusters. To improve the performance of the density peak algorithm, this thesis proposes two improved algorithms. The main research contents are as follows:

(1) To identify cluster centers, a novel density peak clustering algorithm that finds cluster centers automatically is proposed. First, a contribution-based density measure is designed for datasets of different sizes; it measures the density of data points precisely and improves the distribution of points on the decision graph. Then, according to the distribution characteristics of density and distance on the decision graph, a new cluster center selection method is designed that automatically selects data points with larger density and distance as local cluster centers. Finally, based on the shared boundary density of pairs of local clusters, local clusters are automatically merged into global clusters. The experimental results show that the proposed algorithm not only identifies local cluster centers automatically but also merges local clusters into global clusters accurately, achieving automatic clustering of datasets. It removes the need to select cluster centers manually during clustering, and its clustering effect is most pronounced on datasets with multiple density peaks.

(2) To handle datasets with large density differences between clusters and to improve the allocation strategy, a density peak clustering algorithm based on K-nearest neighbors is proposed. The idea of K-nearest neighbors is integrated into both the density metric and the allocation strategy, so as to reduce the influence of density information on the selection of cluster centers and the assignment of data points, addressing the misidentification of centers of sparse clusters and the misassignment of non-center data points. First, a new density metric is proposed that combines information about the distribution surrounding each data point with the average distance to its K nearest neighbors. This metric raises the density of sparse clusters while keeping the density of the originally dense regions high, which effectively reduces the impact of density differences between clusters on the identification of cluster centers. In addition, the new allocation strategy assigns data points to clusters by first distinguishing boundary points. A breadth-first strategy completes the clustering of non-boundary data points and, at the same time, automatically removes redundant cluster centers that appear in the same cluster. Then the remaining data points are assigned step by step according to the clusters of their neighbors. With this allocation method, the problem of multiple density peaks within a single cluster is solved without a subsequent merging operation, and the chain-type misclassification produced by the original algorithm is effectively avoided. Experimental results on several artificial datasets and real UCI datasets verify the efficiency and feasibility of the improved algorithm.
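For orientation, the baseline that both contributions modify can be sketched as follows. This is a minimal illustration of the standard density peak procedure (local density, distance to the nearest denser point, decision-graph score), not the thesis's improved algorithms. The function names, the `n_centers` parameter, and the simple inverse-mean-distance K-nearest-neighbor density are assumptions made for illustration; the thesis's own density metric, automatic center selection, and breadth-first allocation are not specified in this abstract and are not reproduced here.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for an (n, d) array."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def cutoff_density(D, dc):
    """Baseline density: number of other points closer than the cut-off dc."""
    return (D < dc).sum(axis=1) - 1  # subtract 1 to exclude the point itself

def knn_density(D, k):
    """Illustrative K-nearest-neighbor density (an assumption, not the
    thesis's formula): inverse of the mean distance to the k nearest points."""
    nearest = np.sort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    return 1.0 / (nearest.mean(axis=1) + 1e-12)

def delta_and_parent(D, rho):
    """For each point, the distance to (and index of) the nearest point of
    higher density. The densest point gets its largest distance and no parent."""
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    delta = np.zeros(len(rho))
    parent = np.full(len(rho), -1)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, len(order)):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(D[i, higher])]
        delta[i], parent[i] = D[i, j], j
    return delta, parent, order

def density_peak_clustering(X, k, n_centers):
    """Baseline procedure with a manually chosen number of centers."""
    D = pairwise_distances(X)
    rho = knn_density(D, k)
    delta, parent, order = delta_and_parent(D, rho)
    gamma = rho * delta  # decision-graph score: large density AND distance
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(len(X), -1)
    labels[centers] = np.arange(n_centers)
    if labels[order[0]] == -1:
        raise ValueError("the densest point must be one of the centers")
    # Visit points by decreasing density so each parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

The final loop is the single-pass assignment in which each point inherits the label of its nearest denser neighbor, so one early error propagates to every point that follows it. That is the chain-type misclassification the abstract refers to, and `n_centers` stands in for the manual center selection that contribution (1) aims to remove.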
Keywords/Search Tags: Density Peak Clustering, Cluster Centers, K-nearest Neighbors, Local Density, Allocation Strategy