
Study On Clustering Algorithm Based On Density Core

Posted on: 2021-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Xia
Full Text: PDF
GTID: 2518306107993509
Subject: Engineering
Abstract/Summary:
With the spread of the mobile Internet and Internet of Things technology, the way data is produced has changed substantially. The size, dimensionality, and variety of data keep growing, and the complexity of data keeps rising. In the era of big data, data itself is becoming more and more important, and it is crucial to mine potentially useful information from massive data. Cluster analysis is one of the main tools of data mining. Its goal is to divide data objects into clusters according to their similarity, so that objects in the same cluster are similar to each other while objects in different clusters are dissimilar. Cluster analysis is widely used in image processing, artificial intelligence, medicine, aerospace, and other fields. Clustering algorithms based on density representative points have achieved good results, but on data sets with complex shapes, methods based on a single density representative point cannot effectively capture the structure of the data set, which degrades the clustering result. This thesis introduces the concept of the density core into cluster analysis; the density core addresses the inability of algorithms based on single representative points to handle data sets with complex shapes. By analyzing the ideas and basic theory of existing clustering algorithms and addressing their shortcomings, we propose a new clustering algorithm. The main work and contributions are as follows:

(1) To address the problem that existing clustering algorithms are not suited to data sets with multiple densities and complex shapes, this thesis proposes a method that extracts the density core using the number of reverse neighbors. The method takes the data points whose number of reverse neighbors is greater than the natural eigenvalue as the density core of the data set. This allows clustering of data sets with a multi-density hierarchy and complex shapes, which existing clustering algorithms cannot handle.

(2) To address the problem that traditional clustering algorithms have too many parameters and are sensitive to them, this thesis introduces the natural neighbor algorithm. The natural neighbor algorithm takes the distribution of the data itself into account and computes the natural characteristics of the data set through adaptive iteration. This process requires no manually supplied values, which removes the parameter sensitivity.

(3) This thesis proposes a minimum spanning tree clustering algorithm based on density core (MSTDC). MSTDC first computes the reverse-neighbor information of the data objects with the natural neighbor algorithm, then takes the data objects whose number of reverse neighbors is greater than the natural eigenvalue as the density core, and finally clusters the density cores with the minimum spanning tree algorithm. The algorithm requires no parameters and adapts well to data sets with multiple densities and complex shapes.

(4) We use twelve artificial data sets and ten real UCI data sets to verify the effectiveness of MSTDC, and compare it with K-Means, DBSCAN, OPTICS, DPC, SNNDPC, and DCore. The experimental results show that MSTDC performs best on most data sets, with better evaluation indices than the other algorithms.
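The three steps of MSTDC described in item (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several details are assumptions because the abstract does not give them: the stopping rule of the natural neighbor search (here, the count of points without any reverse neighbor reaches zero or stops changing), the way the spanning tree is split (here, by removing the longest edges, controlled by a hypothetical `n_clusters` argument that the parameter-free MSTDC would not need), and the assignment of non-core points to the cluster of their nearest core. All function names are illustrative.

```python
import math

def natural_neighbor_search(points):
    """Grow the neighborhood size r until the number of points with no reverse
    neighbor reaches zero or stops changing (assumed stopping rule).
    Returns (r, reverse-neighbor counts); r plays the role of the natural eigenvalue."""
    n = len(points)
    # For every point, the indices of all other points sorted by distance.
    order = [sorted((j for j in range(n) if j != i),
                    key=lambda j: math.dist(points[i], points[j]))
             for i in range(n)]
    reverse_count = [0] * n
    prev_orphans = None
    r = 0
    while r < n - 1:
        for i in range(n):
            reverse_count[order[i][r]] += 1  # i adds its (r+1)-th nearest neighbor
        r += 1
        orphans = reverse_count.count(0)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    return r, reverse_count

def density_cores(points):
    """Indices of points whose reverse-neighbor count exceeds the natural eigenvalue."""
    r, reverse_count = natural_neighbor_search(points)
    return [i for i, count in enumerate(reverse_count) if count > r]

def minimum_spanning_tree(points, nodes):
    """Prim's algorithm on the complete graph over `nodes`; returns (length, a, b) edges."""
    best = {v: (math.dist(points[nodes[0]], points[v]), nodes[0]) for v in nodes[1:]}
    edges = []
    while best:
        v = min(best, key=lambda u: best[u][0])
        length, parent = best.pop(v)
        edges.append((length, parent, v))
        for u in best:  # relax the remaining nodes against the newly added one
            d = math.dist(points[v], points[u])
            if d < best[u][0]:
                best[u] = (d, v)
    return edges

def mst_density_core_clustering(points, n_clusters):
    """Cluster the density cores with an MST, then label every point.
    `n_clusters` is a hypothetical stand-in for the thesis's splitting rule."""
    cores = density_cores(points)
    if len(cores) < n_clusters:
        raise ValueError("fewer density cores than requested clusters")
    edges = sorted(minimum_spanning_tree(points, cores))
    kept = edges[:len(edges) - (n_clusters - 1)]  # drop the longest edges

    parent = {c: c for c in cores}  # union-find over the kept edges
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for _, a, b in kept:
        parent[find(a)] = find(b)

    roots = sorted({find(c) for c in cores})
    label_of_root = {root: k for k, root in enumerate(roots)}
    labels = []
    for p in points:
        # Assumption: each point inherits the label of its nearest density core.
        nearest = min(cores, key=lambda c: math.dist(p, points[c]))
        labels.append(label_of_root[find(nearest)])
    return labels

# Two well-separated groups of five collinear points each (toy data).
points = [(float(x), 0.0) for x in range(5)] + [(100.0 + x, 0.0) for x in range(5)]
print(mst_density_core_clustering(points, n_clusters=2))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

On this toy input the search stops at a natural eigenvalue of 2, and the middle point of each group is the only one with more than two reverse neighbors, so each group contributes one density core. The sketch computes all pairwise distances and is meant for small inputs only.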
Keywords/Search Tags:Clustering, Density Core, Reverse-nearest Neighbor, Minimum Spanning Tree, MSTDC