
Research On Density Peak Clustering Algorithm Based On Adaptive Reachable Distance

Posted on: 2022-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhang
GTID: 2558307070456254
Subject: Statistics
Abstract/Summary:
The density peak clustering (DPC) algorithm is one of the hotspots in theoretical and applied research on cluster analysis, and it has the advantage of rapid search and discovery. Using the density and distance attributes of the sample points, the algorithm can quickly determine the cluster centers and the number of clusters. In addition, it can find clusters of arbitrary shape and identify outliers. However, it has shortcomings, such as the manual selection of parameters and difficulty in partitioning clusters of complex density. First, to address the DPC algorithm's reliance on a manually selected cutoff distance and the errors introduced by nearest-neighbor assignment, a density peak clustering algorithm based on adaptive reachable distance (ARD-DPC) is proposed. The algorithm uses non-parametric kernel density estimation to calculate the local density of the data points, and obtains the final clustering result by using the adaptive reachable distance to assign data points. The experimental results show that, compared with the DPC algorithm, the ARD-DPC algorithm can accurately identify the number of clusters as well as clusters with complex density. Second, to correct the calculation error of the adaptive reachable distance in the ARD-DPC algorithm, we propose the IARD-DPC algorithm, which uses a new calculation method to obtain the adaptive reachable distance. The performance indices of the IARD-DPC algorithm are compared with those of the ARD-DPC algorithm in a simulation experiment on the synthetic data set, which verifies the validity of the IARD-DPC algorithm. Finally, because the IARD-DPC algorithm has difficulty with the false merging of weakly reachable clusters, we further propose the ITARD-DPC algorithm, based on the concepts of strong core points and weak core points. Over-merging of clusters is avoided by applying a truncation threshold. Simulation experiments on the Aggregation data set verify that the strong core points and weak core points proposed by the algorithm can effectively handle weakly reachable clusters. In addition, simulation experiments are performed on 7 synthetic data sets, in which the proposed algorithm is compared with the DPC, ARD-DPC and IARD-DPC algorithms. The experimental results show that the ITARD-DPC algorithm has higher feasibility, rationality and effectiveness.
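To illustrate the density and distance attributes mentioned in the abstract, the following is a minimal sketch of the baseline DPC procedure only. It is not the ARD-DPC, IARD-DPC or ITARD-DPC method of the thesis, whose reachable-distance formulas are not given here. The function name `dpc_sketch`, the Gaussian kernel form, and the parameters `d_c` (cutoff distance) and `n_centers` are assumptions made for this illustration.

```python
import numpy as np

def dpc_sketch(X, d_c, n_centers):
    """Baseline density peak clustering (illustrative sketch, not ARD-DPC).

    X: (n, d) array of points; d_c: manually chosen cutoff distance;
    n_centers: number of cluster centers to pick (assumed known here).
    """
    n = len(X)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density rho: Gaussian kernel of width d_c, self term removed.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # Point indices sorted by decreasing density.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)              # distance to nearest denser point
    parent = np.full(n, -1)          # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, delta is its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]        # points ranked denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Centers are the points with the largest rho * delta.
    gamma = rho * delta
    gamma[order[0]] = np.inf         # guard: densest point is always a center
    centers = np.argsort(-gamma)[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Nearest-neighbor assignment: each remaining point takes the label of
    # its nearest denser point, visited in decreasing density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta
```

Calling `dpc_sketch(X, d_c=0.5, n_centers=2)` on two well-separated blobs would be expected to return one label per blob. The two inputs that must be chosen by hand (`d_c` and the number of centers) and the final assignment loop correspond to the shortcomings that the abstract says the proposed algorithms target.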
Keywords/Search Tags: clustering algorithm, density peak, cutoff distance, non-parametric kernel density estimation, adaptive reachable distance, truncation threshold