Research On Density Clustering Algorithm Based On Grouping

Posted on: 2024-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Li
GTID: 2568307124963749
Subject: Computer Science and Technology
Abstract/Summary:
Clustering has long been an effective way to label unlabeled data. Because density-based clustering algorithms can identify non-spherical clusters, they have remained a research hotspot. However, density clustering requires computing the similarity between every pair of samples, and a suitable density radius is difficult to determine. This thesis presents a series of improvements. The contents of the study are as follows:

(1) Density peaks clustering (DPC) requires manual selection of centers and must compute the similarity between every pair of samples when calculating density. To address this, the thesis proposes a density clustering algorithm based on grouping and transitivity (GDC). First, the data are grouped, the density of each group is calculated, and each group is classified as a core group or a boundary group according to a threshold. Second, any two core groups whose distance is below a threshold are merged. Third, boundary groups whose distance to a sample already in a cluster is below the threshold are merged into that cluster. Fourth, each unassigned boundary group with non-zero density is assigned to the cluster of its nearest center. Finally, groups that remain unassigned are labeled as noise. Experiments show that GDC achieves higher accuracy than the algorithms it is compared against.

(2) GDC has difficulty processing complex, noisy data, and DBSCAN tends to label too many points as noise. To address this, the thesis proposes a density clustering algorithm based on grouping and a minimum spanning tree (MST-GDC). First, the data are grouped, the density of each group is calculated, and each group is classified as a core group or a boundary group according to a threshold. Second, a minimum spanning tree is built over the core groups, and edges whose weight exceeds the threshold are "cut". Third, each boundary group is assigned to a cluster whose distance from it is below the threshold. Finally, the remaining boundary groups are labeled as noise. Experiments show that MST-GDC achieves higher accuracy than the algorithms it is compared against.

(3) The efficiency of MST-GDC is still not high enough. To address this, the thesis proposes a density clustering algorithm based on a grid and a minimum spanning tree (MST-GRDC). First, a grid is constructed to complete the grouping quickly, and each group is classified as a core group or a boundary group according to a threshold. Second, adjacent core groups are searched to build the minimum spanning tree quickly and complete the "cut". Finally, boundary groups are allocated in the same way as in MST-GDC. Experiments show that MST-GRDC resolves the efficiency problem of MST-GDC while preserving the accuracy of the algorithm.
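
The grid-and-spanning-tree steps described in (2) and (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several details are assumptions made only for illustration: a group's density is taken to be its point count, the distance between groups is the distance between their centroids, the tree is built with Prim's algorithm over all core groups rather than only adjacent ones, and the names `cell`, `min_pts`, `cut`, and `attach` are hypothetical parameters.

```python
from collections import defaultdict
from math import dist, floor


def grid_groups(points, cell):
    """Group points (tuples) by the grid cell they fall into."""
    groups = defaultdict(list)
    for p in points:
        groups[tuple(floor(c / cell) for c in p)].append(p)
    return groups


def centroid(pts):
    """Mean position of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))


def mst_edges(nodes):
    """Prim's algorithm on the complete graph; returns (weight, i, j) edges."""
    n = len(nodes)
    in_tree, best, parent = [False] * n, [float("inf")] * n, [-1] * n
    if n:
        best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest node not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            d = dist(nodes[u], nodes[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    return edges


def cluster(points, cell, min_pts, cut, attach):
    """Return {point: label}; label -1 marks noise."""
    groups = grid_groups(points, cell)
    # Assumption: density = point count; core groups reach min_pts.
    core = [k for k, pts in groups.items() if len(pts) >= min_pts]
    border = [k for k, pts in groups.items() if len(pts) < min_pts]
    centers = [centroid(groups[k]) for k in core]

    # Union-find: core groups stay connected only through uncut MST edges.
    root = list(range(len(core)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for weight, i, j in mst_edges(centers):
        if weight <= cut:  # edges heavier than `cut` are dropped
            root[find(i)] = find(j)

    labels, ids = {}, {}
    for idx, k in enumerate(core):
        labels[k] = ids.setdefault(find(idx), len(ids))

    # Boundary groups join the nearest core group's cluster if close enough.
    for k in border:
        c = centroid(groups[k])
        near = min(range(len(core)), key=lambda i: dist(c, centers[i]), default=None)
        if near is not None and dist(c, centers[near]) <= attach:
            labels[k] = labels[core[near]]
        else:
            labels[k] = -1

    return {p: labels[k] for k, pts in groups.items() for p in pts}
```

Calling `cluster(points, cell=1.0, min_pts=3, cut=2.0, attach=1.5)` on a list of coordinate tuples returns a dictionary from each point to a cluster label. Restricting the tree construction to neighbouring grid cells, as (3) describes, would avoid the all-pairs distance computation in this sketch.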
Keywords/Search Tags: density clustering, grouping clustering, grid clustering, minimum spanning