
The Research On Arbitrary Shape Cluster Algorithm Based On Hierarchy And Density

Posted on: 2017-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: L J Niu
Full Text: PDF
GTID: 2348330536955775
Subject: Computer Science and Technology
Abstract/Summary:
Clustering is an important research direction in data mining. It helps people understand the distribution and characteristics of data as a basis for further research and analysis. Although many clustering algorithms have been proposed, cluster analysis still faces many problems and challenges. Building on hierarchical clustering and density clustering, this thesis proposes a novel algorithm for finding clusters of arbitrary shape. Within the framework of hierarchical clustering, the algorithm uses density-based methods to define both the initial sub-clusters and the clustering criterion function. The main work of this thesis is as follows:

(1) Hierarchical clustering algorithms have high computational time complexity and require the number of clusters or a threshold parameter to be entered manually as the termination condition. To address this, the thesis proposes a novel density-based clustering method. Its criterion is that the density of the boundary region between two clusters is larger than that of either cluster. This dynamic model adapts automatically to the internal characteristics of the clusters, so the method can determine the number of clusters and the termination point of clustering automatically, and it can find clusters of arbitrary shape.

(2) Some density clustering algorithms tend to overlook density peaks in sparse regions. To address this, the thesis relaxes the selection criterion for density peaks: a point that lies relatively far from other high-density points is taken as a density peak. The density peaks are then used to divide the data set into a large number of initial sub-clusters. Experiments show that the resulting sub-clusters are reasonable.

(3) Some density clustering algorithms use a single global distance parameter, which works poorly on data sets with large differences in density. To address this, the data are layered into a low-density part and a high-density part, the low-density sub-clusters are selected, and appropriate distance parameters are set for them.

Experiments on test data sets and real data sets demonstrate that the algorithm can determine the number of clusters automatically, can find clusters of any shape and size, is robust to the choice of input parameter, and achieves better clustering accuracy on data sets with uneven density distribution.
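The abstract does not give formulas for density peak selection, so the following is only a minimal sketch of the general idea in item (2): a point that is far from every denser point is treated as a peak, and each remaining point joins the sub-cluster of its nearest denser neighbour. The function name `density_peak_subclusters`, the cutoff `dc`, and the threshold `delta_min` are assumptions made for illustration. They follow the common density-peaks formulation and are not taken from the thesis.

```python
import math


def density_peak_subclusters(points, dc, delta_min):
    """Split points into initial sub-clusters around density peaks.

    dc (neighbourhood cutoff) and delta_min (peak distance threshold)
    are hypothetical parameters for this sketch, not the thesis's own.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points strictly closer than dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest earlier-ranked point
    parent = [-1] * n    # index of that point, or -1 for the densest one
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour, so by
            # convention it gets its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # A point far from every denser point starts a new sub-cluster;
    # every other point inherits the label of its parent, which has
    # already been labelled because parents come earlier in `order`.
    labels = [-1] * n
    peaks = []
    for i in order:
        if parent[i] == -1 or delta[i] >= delta_min:
            labels[i] = len(peaks)
            peaks.append(i)
        else:
            labels[i] = labels[parent[i]]
    return peaks, labels


# Two small, well-separated groups, each a unit square plus its centre.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
peaks, labels = density_peak_subclusters(points, dc=1.0, delta_min=5.0)
print(peaks)   # [4, 9]
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

Lowering `delta_min` in this sketch yields more, smaller sub-clusters, which corresponds loosely to the relaxed peak criterion described above. The boundary-density merging step of item (1) is not shown because the abstract does not specify it.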
Keywords/Search Tags:hierarchical clustering, density clustering, arbitrary shape of clusters, sub-clusters merge method, density peaks, border region density