Research And Improvement Of Density Peak Clustering Algorithm

Posted on:2024-04-15

Degree:Master

Type:Thesis

Country:China

Candidate:M Zhou

GTID:2558306917984599

Subject:Mathematics

Abstract/Summary:

As information network technology continues to expand, analyzing and handling complex, diversified, and quantified data effectively has become a research priority. This has accelerated research on clustering algorithms in machine learning and gave rise to the density peak clustering algorithm (Clustering by Fast Search and Find of Density Peaks, DPC). This thesis focuses on the DPC algorithm, which attracted wide attention and study from scholars as soon as it was proposed. Its advantages include an efficient implementation, a novel design concept, the ability to handle nonlinearly separable datasets, fast identification of cluster centers through decision graphs, and insensitivity to outliers. However, the DPC algorithm also has defects: high computational complexity, subjective selection of cluster centers, and a data assignment step prone to chain-reaction errors, where one wrong assignment propagates to the points that follow it. To address these deficiencies, this thesis designs two improved DPC algorithms, as follows.

The first algorithm targets three problems: the poor performance of DPC on clusters that contain multiple density peaks, the empirical choice of cluster centers from the decision graph, and the lack of robustness in the data assignment process. A new density peak clustering algorithm based on a cluster fusion strategy is proposed. Firstly, the algorithm screens out candidate cluster centers by setting two new thresholds, which avoids the effect of noise points and outliers. Secondly, taking the structural characteristics and spatial distribution of the dataset into account, new definitions of boundary points, inter-cluster intersection density, and inter-cluster boundary density are given. To cluster correctly when a single cluster contains multiple density peaks, a new cluster fusion strategy is designed, which not only selects the cluster centers correctly but also corrects the chain-reaction errors in the data point assignment process. Finally, experimental tests are conducted, and the results indicate that the new algorithm substantially improves clustering accuracy and robustness.

The second algorithm targets three further problems: DPC clusters poorly on unevenly distributed datasets, it considers only distances and ignores the correlation between samples, and it obtains cluster centers by intuition from the decision graph. A density peak clustering algorithm based on shared neighborhood is presented. Firstly, the neighbor information of the data points and the degree of relationship between the data are considered, and the local density is redefined according to the shared neighborhood. Secondly, a new decision threshold is designed to distinguish cluster centers from non-center points, so that the cluster centers are obtained automatically and the influence of human intervention is avoided. Finally, comparison experiments are set up. The results indicate that the new algorithm improves accuracy and stability while maintaining the original complexity.
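
For readers unfamiliar with the baseline that both improvements start from, the following is a minimal Python sketch of the original DPC procedure: local density, distance to the nearest higher-density point, center selection, and assignment. It is not either of the improved algorithms described above. The function name dpc, the parameters d_c (cutoff distance) and n_centers, the Gaussian density kernel, and the choice of picking centers by the largest product of density and distance (instead of reading them off a decision graph by eye) are all assumptions made for illustration.

```python
import numpy as np

def dpc(points, d_c, n_centers):
    """Baseline density peak clustering sketch (names are illustrative)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Local density rho_i with a Gaussian kernel; subtract the self term exp(0) = 1.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # Indices sorted by descending density.
    order = np.argsort(-rho, kind="stable")
    # delta_i: distance to the nearest point that ranks higher in density.
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    top = order[0]
    delta[top] = dist[top].max()  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    # Rank candidates by gamma = rho * delta; the densest point is always a center.
    gamma = rho * delta
    gamma[top] = np.inf
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(len(centers))
    # Assign the remaining points in descending density order: each point takes
    # the label of its nearest higher-density neighbor, which is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, rho, delta
```

The final loop is where the chain-reaction errors mentioned in the abstract arise: because each point inherits its label from a single higher-density neighbor, one wrong assignment is passed on to every point that depends on it.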

Keywords/Search Tags:

density peak clustering, candidate cluster center, cluster fusion strategy, local density, decision threshold

Related items

1	Research On Improved Density Peak Clustering Algorithm
2	Research Of Clustering Algorithm Based On Data Local Distribution
3	Research On Two Improved Density Peaks Clustering Algorithms
4	Density Peak Clustering Algorithm Based On Adaptive Cluster Center
5	Research On Density Peak Clustering Algorithm By Automatically Determining Clusters
6	Research On Improved Density Peak Clustering Methods Based On K-nearest Neighbors
7	Improvement Of Density Peak Clustering Algorithm And Its Customer Segmentation Application
8	The Research And Implementation Of A Novel Text Clustering Algorithm Based On Density Peak
9	Manifold Density Peak Clustering Algorithm And Its Application Of Weibo Text Classification
10	Research On Clustering Algorithm Based On Density Peak And Its Application In Text Clustering