Research On Improved Density Peaks Clustering Algorithm

Posted on: 2020-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: K Zhou
Full Text: PDF
GTID: 2428330623457408
Subject: Software engineering
Abstract/Summary:
Cluster analysis is one of the mainstream techniques in data mining. Clustering algorithms are both diverse and targeted: diverse in that a wide variety of algorithms exists, and targeted in that each algorithm suits particular application scenarios. Depending on the criterion used, clustering algorithms can be divided into partitioning, hierarchical, density-based, grid-based, and model-based clustering, among others. Density Peaks Clustering is an efficient density-based clustering algorithm. This thesis studies and analyzes the shortcomings of the Density Peaks Clustering algorithm and proposes two improved algorithms. The main work is as follows:

(1) The density peaks clustering algorithm assigns each abnormal point to the cluster closest to it, and the number of clusters read from its 2D decision graph can be inaccurate. To address these problems, a density peaks clustering algorithm based on the theory of universal gravitation is proposed. Gravitation theory is applied to the density peaks clustering algorithm to enhance its ability to detect anomalies and to optimize the decision graph: the reciprocal of the gravitation parameter is used instead of the distance as the ordinate of the decision graph, so that the algorithm can accurately identify anomalies and centroids.

(2) When a single cluster contains multiple density peaks, the density peaks clustering algorithm treats each density peak as a potential cluster center, which makes it difficult to determine the correct number of clusters in the data set. To address this problem, a hybrid density peaks clustering algorithm based on CURE is proposed. First, the density peaks are found and used as the initial cluster centers, and the data set is divided into sub-clusters. Then the hierarchical clustering algorithm CURE (Clustering Using Representatives) is used to select scattered representative points from each sub-cluster, the classes whose representative points are closest to each other are merged, and a shrink factor parameter is introduced to control the shape of the classes.

The two algorithms proposed in this thesis are compared with the original density peaks clustering algorithm and other classical clustering algorithms on synthetic data sets and UCI data sets. According to the experimental results, across various types of data sets the improved algorithms can recognize clusters of arbitrary shape, different sizes, and different densities, can accurately identify the number of clusters, and can detect anomalies. Their clustering results are better than those of the original density peaks clustering algorithm and the other classical algorithms.
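
For orientation, the baseline density peaks clustering procedure that both improvements start from can be sketched as follows. This is a minimal illustration of the original algorithm only, not of the gravitation-based or CURE-based variants, whose formulas the abstract does not give. The function name density_peaks, the cutoff distance d_c, the Gaussian density kernel, and the choice to pick the n_clusters points with the largest product of density and distance as centers (instead of reading them off the decision graph by eye) are all assumptions made for this example.

```python
import math

def density_peaks(points, d_c, n_clusters):
    """Baseline density peaks clustering (illustrative sketch, assumed names)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho: Gaussian kernel over all other points (assumed kernel).
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]

    # Indices by decreasing density; sorted() is stable, so ties keep index order.
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta: distance to the nearest point of higher density.
    # The densest point gets its largest distance to any other point.
    delta = [0.0] * n
    parent = [-1] * n  # nearest denser neighbour, used for label propagation
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
        else:
            j = min(order[:rank], key=lambda k: dist[i][k])
            delta[i] = dist[i][j]
            parent[i] = j

    # Decision-graph quantity: centers have both high rho and high delta.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_clusters]

    # Every remaining point inherits the label of its nearest denser neighbour.
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels

# Two small, well-separated groups of points (made-up data).
pts = [(0, 0), (0.1, 0), (0, 0.1), (-0.1, 0), (0, -0.1),
       (5, 5), (5.1, 5), (5, 5.1), (4.9, 5), (5, 4.9)]
rho, delta, labels = density_peaks(pts, d_c=0.5, n_clusters=2)
print(labels)
```

The label propagation step is where the first weakness named in the abstract shows up: an isolated abnormal point simply inherits the label of its nearest denser neighbour, so it always ends up in some cluster.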
Keywords/Search Tags: Cluster analysis, Density peaks, Cluster merging, Contraction factor, Gravitation