
Research On Improvement Of Clustering Algorithm Based On Density Peaks

Posted on: 2021-09-05  Degree: Master  Type: Thesis
Country: China  Candidate: L Zhao  Full Text: PDF
GTID: 2518306032959559  Subject: Computer system architecture
Abstract/Summary:
With the rapid spread and development of computer technology worldwide in recent years, large-scale, high-dimensional, mixed-type data are generated at every moment, and the rate of data generation keeps increasing. We are entering the era of "big data", defined by an information explosion. Traditional methods are no longer sufficient to handle data at this scale. How to process large-scale data and mine useful information from it rapidly and effectively has become an important research topic in computer science; in 2017 the State Council raised artificial intelligence, one of the important development directions of computer science, to the level of national strategy.

Cluster analysis is one of the mainstream technologies and important tools in machine learning and data mining. It refers to the process of dividing data objects into multiple clusters. It has been applied in many fields, such as image pattern recognition, etiology analysis, business intelligence, intelligent medical treatment, and intelligent agriculture. Clustering algorithms can be broadly divided into five categories: partition-based, hierarchical, density-based, grid-based, and model-based clustering. Clustering by fast search and find of density peaks is a novel and efficient density-based clustering algorithm that is easy to implement and has few parameters, but it is still at an early stage of development, and many problems need further research and improvement. This paper proposes two improved algorithms to overcome its shortcomings. The specific improvements are as follows:

(1) To address the low computational efficiency of clustering by fast search and find of density peaks on large-scale data sets, a density peaks clustering algorithm based on a grid pre-screening strategy (GPDPC) is proposed. First, the data space is divided into multiple identical grid cells, the data objects are mapped to their corresponding cells, and the density of each cell is calculated. The set of cluster center candidates is then reduced by filtering out the data objects in low-density cells, which lowers the time consumption of the clustering process and speeds up the algorithm. Extensive experiments on artificial and real-world data sets show that the GPDPC algorithm reduces time consumption to a large extent while preserving clustering accuracy.

(2) To address the poor clustering quality that results when density peaks clustering selects cluster centers incorrectly, an adaptive density peaks clustering algorithm based on density-reachability is proposed, drawing on the concept of density-reachable from the DBSCAN algorithm. First, the data space is divided into multiple identical grid cells, and the density and distance of every data object in each cell are calculated. The data object with the highest density and the largest distance is then selected as the initial cluster center, and cluster labels are assigned. Finally, certain clusters are merged using density-reachability. Experiments show that the adaptive density peaks clustering algorithm based on density-reachability not only completes the clustering automatically but also outperforms other competitive methods...
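
The grid pre-screening idea in item (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `grid_prescreen` and `density_and_distance`, the parameters `cells_per_dim`, `min_count`, and `d_c`, and the choice of a cutoff kernel for local density are all assumptions made for illustration. The sketch maps points to equal-width grid cells, keeps only points in cells that hold at least `min_count` points, and then computes the two usual density peaks quantities (local density and distance to the nearest denser point) for the surviving candidates only.

```python
from collections import defaultdict
from math import dist


def grid_prescreen(points, cells_per_dim, min_count):
    """Return sorted indices of points in grid cells holding >= min_count points.

    Hypothetical helper: the cell size and density threshold are assumptions.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    # Equal-width cells in every dimension; fall back to 1.0 if the extent is zero.
    widths = [((highs[d] - lows[d]) / cells_per_dim) or 1.0 for d in range(dims)]

    cells = defaultdict(list)
    for i, p in enumerate(points):
        # Clamp so that points on the upper boundary fall into the last cell.
        key = tuple(
            min(int((p[d] - lows[d]) / widths[d]), cells_per_dim - 1)
            for d in range(dims)
        )
        cells[key].append(i)

    return sorted(
        i for members in cells.values() if len(members) >= min_count for i in members
    )


def density_and_distance(points, candidates, d_c):
    """Compute local density (rho) and distance to nearest denser candidate (delta).

    rho uses a cutoff kernel: the number of other points closer than d_c.
    Only the pre-screened candidates are scored (an assumption of this sketch).
    """
    rho = {
        i: sum(
            1
            for j in range(len(points))
            if j != i and dist(points[i], points[j]) < d_c
        )
        for i in candidates
    }
    delta = {}
    for i in candidates:
        denser = [dist(points[i], points[j]) for j in candidates if rho[j] > rho[i]]
        # The densest candidate has no denser neighbour; use its largest distance.
        delta[i] = min(denser) if denser else max(
            dist(points[i], points[j]) for j in candidates
        )
    return rho, delta
```

Candidates with both high `rho` and high `delta` would then be taken as cluster centers. Because isolated points in sparse cells are dropped before the pairwise distance step, the quadratic part of the computation runs over fewer objects, which is the source of the speed-up the abstract describes.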
Keywords/Search Tags:Cluster analysis, density peak, grid pre-screening, density-reachable