
Research On Density Peaks Clustering

Posted on: 2019-12-29; Degree: Doctor; Type: Dissertation
Country: China; Candidate: M J Du; Full Text: PDF
GTID: 1368330566463044; Subject: Computer application technology
Abstract/Summary:
Density peaks (DP) clustering is a density-based clustering algorithm. It offers several advantages, including a simple implementation and few control parameters, and it has become a popular research area. DP clustering is still a relatively new theory under development, so many problems remain open for further study. This dissertation focuses on the shortcomings of the density peaks clustering algorithm and studies corresponding improvements from three aspects: improving performance, enhancing robustness, and extending applicability. The specific research contents are as follows:

1. Density peaks clustering based on k nearest neighbors and principal component analysis. The local density based on ε-neighbors has weak robustness, which may affect the performance and usability of the density peaks algorithm. In addition, the method based on ε-neighbors is prone to the curse of dimensionality. To overcome these problems, the idea of k nearest neighbors (KNN) is introduced into density peaks clustering, and a density peaks clustering algorithm based on k nearest neighbors (DPC-KNN) is proposed. Furthermore, because redundant attributes may degrade clustering performance, principal component analysis is introduced into DPC-KNN, and DPC-KNN-PCA is proposed.

2. Density peaks clustering based on geodesic distances. To objectively reflect the manifold structure of data, the geodesic distance commonly used in manifold learning is introduced into the distance computation. The nonlinear distance between faraway points can be approximated by adding up a sequence of “short hops” between neighboring points. To better process data containing multiple manifold structures, this distance measure is introduced into the density peaks clustering algorithm, and a density peaks clustering algorithm based on geodesic distances is proposed.

3. Density peaks clustering based on sensitivity of local density and a density-adaptive metric. Density peaks clustering is unsuitable for data with complex structure. An alternative definition of local density, based on sensitivity, is given. To reflect the complex structure of data, a density-sensitive distance measure is defined. This measure squeezes the distances between data points in high-density regions. Viewed another way, the distances between cluster centers and between data points in low-density regions are comparatively magnified. On the basis of these two ideas, a density peaks clustering algorithm based on sensitivity of local density and a density-adaptive metric is proposed.

4. Density peaks clustering for mixed-type data. Density peaks clustering works only on numerical values. To address this, a similarity metric that can be applied to mixed-type data is defined using an entropy-based criterion. To further improve the feasibility and performance of density peaks clustering, a fuzzy neighborhood relation is applied to redefine the local density. Furthermore, an automatic cluster center selection method is developed. On the basis of these strategies, a density-based clustering algorithm for mixed-type data is proposed. This method can deal with three types of data: numerical, categorical, and mixed-type data.
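
The KNN-based local density in item 1 can be illustrated with a minimal sketch. The abstract does not give the formula, so the density used here (the exponential of the negative mean squared distance to the k nearest neighbors) is an assumption, as are the function names `knn_density_peaks` and `pick_centers`. The delta value and the rho * delta ranking follow the general density peaks idea. This is not the dissertation's implementation.

```python
import math


def knn_density_peaks(points, k):
    """Return (rho, delta) lists for the given points.

    rho:   KNN-based local density, ASSUMED here to be
           exp(-mean squared distance to the k nearest neighbors).
    delta: distance to the nearest point of higher density; for the
           densest point, the largest distance to any other point.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    rho = []
    for i in range(n):
        # Distances to the k nearest neighbors, excluding the point itself.
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        rho.append(math.exp(-sum(d * d for d in nearest) / k))

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for pos in range(1, n):
        i = order[pos]
        delta[i] = min(dist[i][j] for j in order[:pos])
    return rho, delta


def pick_centers(rho, delta, n_clusters):
    """Pick the indices with the largest rho * delta as cluster centers."""
    gamma = [r * d for r, d in zip(rho, delta)]
    return sorted(range(len(gamma)), key=lambda i: -gamma[i])[:n_clusters]
```

Points that are both dense and far from any denser point get a large rho * delta, which is why they are candidate cluster centers. The remaining points would then be assigned to the cluster of their nearest denser neighbor, a step this sketch omits.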
Keywords/Search Tags: clustering analysis, density peak clustering, density-based clustering, similarity measure