
Research On Clustering Algorithm Based On Density Peaks

Posted on: 2019-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: X N Xue
GTID: 2428330572951761
Subject: Operational Research and Cybernetics

Abstract/Summary:
As a core part of data mining, clustering analysis can uncover potentially valuable information by exploring the internal structure of data and the correlations within it, without any prior knowledge, so it has been widely used in fields such as text mining, bioinformatics and image processing. The Density Peak Clustering (DPC) algorithm is a novel clustering algorithm that has been highly recognized by many scholars; its advantages are a simple principle, easy implementation and fast clustering. However, no single clustering algorithm can solve all clustering problems, so this paper discusses improved algorithms that address the limitations of DPC. The main work of this paper is as follows:

1. An improved density peaks clustering algorithm combining K-nearest neighbors is proposed. In the DPC algorithm, the density calculation methods are inconsistent, the assignment strategy for the remaining points easily propagates errors, and the clustering quality is comparatively low. An improved density peaks clustering algorithm (IDPC) is therefore proposed, which gives a new local density calculation method and designs two different allocation strategies for the remaining points using the ideas of K-nearest neighbors and a queue. Numerical experiments on 21 different datasets compare IDPC with DPC, AP, DBSCAN, K-means and FKNN-DPC, and show that IDPC is superior to the other approaches in clustering quality and efficiency.

2. A density peaks clustering algorithm based on K-nearest neighbors and classes-merging is proposed. Since DPC performs poorly on datasets with complex structure, high dimensionality, or clusters that contain several density peaks, a novel density peaks clustering algorithm, known as KM-DPC, is presented. It modifies the evaluation index of cluster centers and the allocation strategy for the remaining points, and also gives a new classes-merging strategy. Comparisons between KM-DPC and IDPC, DPC, AP, DBSCAN, K-means and FKNN-DPC on 22 benchmark datasets show that KM-DPC has obvious advantages.

3. An identification algorithm for eye movement data based on density peaks is developed. DPC and its improved variants cannot handle real human eye movement data, which has a high degree of crossover and overlap among classes, so a new method for identifying eye movement data, known as the Eye DP algorithm, is proposed. The core idea of Eye DP is as follows: first, a given distance threshold is used to extract the saccade data; then the idea of density peaks is combined with K-nearest neighbors to design the allocation strategy for the remaining points and the classes-merging condition, which identify the fixation data and the pursuit data; finally, fault-tolerant processing steps are introduced to reduce the error rate. A comparison of Eye DP with the classical identification algorithm on 11 groups of real eye movement data shows, in both clustering effect and performance benchmarks, that Eye DP performs well in quality.

Overall, the improved algorithms proposed in this paper achieve good results, but some problems remain: the selection of cluster centers requires manual intervention, only a single similarity measure is used, and the operating efficiency of the algorithms needs further study. These are the main tasks for the next step.
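
For orientation, the baseline DPC procedure that the abstract takes as its starting point can be sketched as below. This is a minimal illustration of DPC as it is commonly described (cutoff-kernel local density, distance to the nearest denser point, and assignment of each remaining point to the cluster of its nearest denser neighbor). It is not the IDPC, KM-DPC or Eye DP method, whose formulas the abstract does not give. The function name `density_peaks` and the parameters `d_c` and `n_clusters` are assumptions made for this example; passing `n_clusters` stands in for the manual selection of cluster centers that the abstract lists as an open problem.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Baseline DPC sketch (hypothetical helper, not the thesis's algorithms).

    points     : list of coordinate tuples
    d_c        : cutoff distance for the local density (assumed parameter)
    n_clusters : number of centers to pick (stands in for manual selection)
    Returns (labels, centers), where centers holds point indices.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest denser point
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; by convention
            # its delta is its largest distance to any other point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: the points with the largest gamma = rho * delta.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Remaining points inherit the label of their nearest denser neighbor.
    # Processing in density order guarantees that neighbor is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Because each remaining point simply copies the label of its nearest denser neighbor, one wrong assignment in a dense region is inherited by every point that depends on it. This is the error propagation that the abstract cites as the motivation for the K-nearest-neighbor allocation strategies.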
Keywords/Search Tags: Clustering, DPC algorithm, K-nearest neighbors, Classes-merging, Eye movement data