
Improvement And Application Of Density Peaks Clustering

Posted on: 2022-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: D Jiang
Full Text: PDF
GTID: 2518306332487884
Subject: Management Science and Engineering
Abstract/Summary:
Clustering algorithms are one of the hot research topics in the field of data mining. Clustering, on its own or in combination with other algorithms, has been widely used in many fields such as biomedicine, economic linguistics, and psychology. Obtaining high-quality clustering results is therefore of great theoretical and practical significance.

Density Peaks Clustering (DPC) is a density-based clustering method. In 2014, Alex Rodriguez and Alessandro Laio published the paper "Clustering by fast search and find of density peaks" in Science, which introduced DPC. The method combines the concepts of local density and distance in a novel way. It can partition data sets of different shapes, it puts forward a distinctive strategy for allocating the remaining (non-center) points, and it eliminates outliers efficiently. However, it also has disadvantages. This thesis makes improvements in the following aspects.

Streamlined data sets are one of the common types of data sets. In such data sets, two points are often close to each other yet belong to two different clusters. Therefore, a DPC algorithm based on the shortest path is proposed. The algorithm separates the two points in this situation by a defined "gap", keeps only the distance paths between each point and its nearest neighbors, and reconstructs the distance between any two points as the shortest-path distance computed by a shortest path algorithm over the resulting matrix. The DPC algorithm based on the shortest path improves the efficiency of DPC in processing streamlined data sets.

DPC assigns each point to the cluster of a nearby point with higher density. This assignment method is likely to cause a domino effect, in which one wrongly assigned point causes the points that depend on it to be wrongly assigned as well. Therefore, the DPC algorithm is combined with an improved FCM; the combination is called IFCM-DPC. IFCM-DPC uses the ability of DPC to find the cluster centers quickly and then uses the improved FCM algorithm for the subsequent clustering, which speeds up convergence to a certain extent and improves the accuracy of clustering.

DPC depends on the choice of a threshold, which raises the question of how to determine that threshold from the data distribution in the data field. An adaptive density peaks clustering algorithm based on K-nearest neighbors and the Gini coefficient (G-KNN-DPC) is proposed. It uses the characteristics of the data distribution in the data field and introduces the Gini coefficient to calculate the threshold, and then uses the KNN algorithm to reconstruct the distance matrix. Experimental results on 13 data sets show that G-KNN-DPC performs very well.

MRI images are segmented with the IFCM-DPC algorithm. In this method, appropriate initial parameters are selected first, and then the image is segmented by the algorithm. Compared with K-means clustering and fuzzy C-means clustering, IFCM-DPC can complete image segmentation with high quality.
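
For orientation, the baseline DPC procedure that the abstract builds on can be sketched as follows. This is a minimal illustration of the original Rodriguez and Laio formulation with a cutoff kernel, not the thesis's improved algorithms (shortest-path DPC, IFCM-DPC, or G-KNN-DPC). The function name `dpc`, the parameters `d_c` and `n_clusters`, and the choice of picking centers as the points with the largest product of density and distance are assumptions made for this example; in the original method the centers are usually chosen by inspecting a decision graph.

```python
import math


def dpc(points, d_c, n_clusters):
    """Baseline density peaks clustering (illustrative sketch).

    points     : list of coordinate tuples
    d_c        : cutoff distance, the threshold the abstract refers to (assumed name)
    n_clusters : number of centers to pick (assumption: top rho * delta)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest point of higher density
    parent = [-1] * n    # index of that nearest higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no higher-density neighbor;
            # by convention it takes the largest distance to any point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: points combining high density with a large delta.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Remaining points inherit the label of their nearest higher-density
    # neighbor; this chain of dependencies is what makes the "domino
    # effect" mentioned in the abstract possible.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels


if __name__ == "__main__":
    # Two small, well-separated groups of points (made-up toy data).
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    print(dpc(pts, d_c=1.0, n_clusters=2)[2])
```

The sketch builds the full pairwise distance matrix, so it is only suitable for small data sets. Its dependence on the cutoff `d_c` and on the one-pass label inheritance are the two weaknesses that the G-KNN-DPC and IFCM-DPC variants described above are meant to address.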
Keywords/Search Tags: Density peak clustering, Shortest path algorithm, K-nearest neighbor algorithm, Image segmentation