Research On Hierarchical Clustering Algorithm Based On Density Peaks

Posted on: 2022-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Tian
Full Text: PDF
GTID: 2518306512997029
Subject: Information management and information systems
Abstract/Summary:
Clustering by fast search and find of density peaks (DPC) is an effective algorithm that finds cluster centers and completes clustering based on local density and relative distance. The algorithm can present the hierarchical relationship between data points by mapping high-dimensional data into a two-dimensional space, and it has the advantages of few parameters, a simple principle, and the ability to find clusters of arbitrary shape. Because of its good clustering performance, DPC is widely used in different fields and has become a research hotspot in data mining. Its efficient allocation strategy offers a new idea for cluster analysis, but the strategy easily causes error propagation and performs poorly on data sets with large density differences between clusters. To address these problems, this thesis designs two improved algorithms. The specific research contents are as follows.

First, a density peak clustering algorithm based on graph segmentation is proposed to solve the error propagation that occurs when a cluster contains multiple density peaks. The data are first clustered according to the DPC allocation strategy, which gathers the original data into multiple local clusters, each led by a local density peak. The boundary samples of each local cluster are then defined, and an allocation strategy is designed to reassign them. Finally, borrowing the idea of spectral clustering, each local cluster is treated as a vertex, edge weights between vertices are defined to form a connected graph, and the final clustering is obtained by minimizing the cut. The experimental results show that this algorithm effectively overcomes the error propagation of DPC and accurately merges local clusters into global clusters. The improvement is most pronounced when a cluster in the data set contains multiple density peaks.

Second, an optimized density peak clustering algorithm is proposed to address the poor performance of DPC on data sets with large density differences among clusters. This algorithm retains the local clustering and boundary point reallocation strategies of the graph-segmentation-based algorithm. During cluster merging, it jointly considers the connectivity between clusters and the number of sample points in each cluster, and it completes the final clustering iteratively. In experiments on several artificial data sets and UCI data sets, the proposed algorithm is compared with the classical DPC algorithm, the DBSCAN algorithm, the E?DPC algorithm and the K-means algorithm. The experimental results show that the proposed algorithm achieves higher clustering accuracy and better robustness.
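As an illustration of the baseline DPC procedure that the abstract builds on (local density, relative distance, and the allocation strategy), the following is a minimal Python sketch, not the thesis's implementation. The Gaussian kernel, the cutoff distance `d_c`, the choice of centers by the largest product of density and distance, and all function and variable names are assumptions made for this example.

```python
import numpy as np


def dpc(points, d_c, n_centers):
    """Minimal sketch of classical DPC (assumed Gaussian-kernel density)."""
    n = len(points)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density rho: Gaussian-kernel sum, minus the self term exp(0) = 1.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # Visit samples from densest to sparsest.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)          # relative distance
    parent = np.full(n, -1)      # nearest sample with higher density
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest sample has no denser neighbor; use its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Assumed center rule: the n_centers samples with the largest rho * delta.
    centers = np.argsort(-(rho * delta))[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Allocation strategy: each remaining sample inherits the label of its
    # nearest denser neighbor. One wrong parent mislabels every sample below
    # it, which is the error propagation discussed in the abstract.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta
```

Calling `dpc(points, d_c=0.5, n_centers=2)` on an `(n, 2)` array returns one label per sample together with the density and distance values that would be plotted in the two-dimensional decision graph.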
Keywords/Search Tags: Data Mining, Density Peak Clustering, Spectral Clustering, Allocation Strategy