
The Research And Application Of Density Peak Clustering Algorithm

Posted on: 2021-07-31
Degree: Master
Type: Thesis
Country: China
Candidate: H L Wei
Full Text: PDF
GTID: 2518306095975769
Subject: Computer technology
Abstract/Summary:
Clustering is an active research area in data mining and a form of unsupervised learning in machine learning. As an important data mining tool, cluster analysis has been applied in many fields, such as biology, security, business intelligence, image pattern recognition and web search. The density peak clustering (DPC) algorithm has three shortcomings: it cannot determine the cluster centers automatically, the selected centers may fall into a local optimum, and the value of the cut-off distance parameter d_c is selected arbitrarily. This thesis proposes improved algorithms that address these shortcomings and applies them to cluster analysis of the stellar spectral data observed by LAMOST. The specific research contents are as follows:

(1) To address the defect that the DPC algorithm relies on human judgment to select the cluster centers, a rapid method for determining density peak cluster centers based on the exponential distribution (EDPC) is proposed. First, a suitable method for calculating local density is chosen by comparison, and the product of density and distance is used as the judgment index. Second, the points whose index lies above the upper bound determined by an exponential distribution are defined as cluster centers. Finally, each remaining point is assigned to the same cluster as its nearest neighbor of higher density. Theoretical analysis and experimental results show that the algorithm selects the cluster centers automatically and effectively, removes the influence of subjective factors and improves clustering efficiency.

(2) To address the defects that the cluster centers selected by DPC may fall into a local optimum and that the value of d_c is selected arbitrarily, a novel clustering algorithm based on DPC and particle swarm optimization (PDPC) is proposed. First, a new method for calculating d_c is proposed to reduce the influence of this parameter on the clustering results. Second, a new fitness function is built on the judgment index proposed in (1); with this fitness function, the PSO algorithm finds K initial center points, after which the clustering iteration is performed and new cluster centers are calculated. Finally, comparison experiments show that PDPC produces more accurate clustering results than the six other algorithms tested and improves efficiency. The algorithm also keeps the selected cluster centers from falling into a local optimum and reduces the influence of d_c on the clustering result.

(3) Building on the above research, the proposed clustering method is applied to LAMOST stellar spectral data. The results show that the method is effective for cluster analysis of stellar spectral data.
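The center-selection idea in (1) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything specific here is an assumption made for illustration: the Gaussian-kernel density, the maximum-likelihood exponential fit, the hypothetical `tail_prob` parameter that sets the upper bound, and the function names themselves. It is not the EDPC implementation.

```python
import numpy as np

def dpc_quantities(X, d_c):
    """Compute DPC local density rho, distance delta, and nearest denser neighbor."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))          # pairwise Euclidean distances
    # Assumed density: Gaussian kernel; subtract 1 to drop each point's self term.
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho, kind="stable")        # densest point first
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = D[i].max()                  # densest point: farthest distance
            continue
        higher = order[:rank]                      # points ranked denser than i
        j = higher[np.argmin(D[i, higher])]
        delta[i] = D[i, j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher, order

def select_centers(rho, delta, order, tail_prob=0.001):
    """Flag points whose gamma = rho * delta exceeds an exponential upper bound."""
    gamma = rho * delta
    scale = gamma.mean()                           # MLE of the exponential scale
    upper = -scale * np.log(tail_prob)             # (1 - tail_prob) quantile
    centers = np.flatnonzero(gamma > upper)
    # Assumption: always keep the densest point so every point can be labeled.
    centers = np.union1d(centers, [order[0]])
    return centers, gamma

def assign_labels(centers, nearest_higher, order):
    """Give each non-center the label of its nearest neighbor of higher density."""
    labels = np.full(len(order), -1)
    labels[centers] = np.arange(len(centers))
    for i in order:                                # denser points are labeled first
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels
```

Calling `dpc_quantities(X, d_c)`, then `select_centers(rho, delta, order)`, then `assign_labels(centers, nearest_higher, order)` yields one integer label per row of `X`. The result depends on the choice of `d_c` and `tail_prob`, which this sketch leaves to the caller.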
Keywords/Search Tags: Clustering, Density peak, Exponential distribution, Particle swarm optimization, Stellar spectral