Research On Density Cluster Centers Constrained Hierarchical Clustering

Posted on: 2017-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
Full Text: PDF
GTID: 2428330488979912
Subject: Computer technology
Abstract/Summary:
Cluster analysis is the process of partitioning a given data set into subsets with a clustering algorithm when no prior knowledge is available. Each resulting subset is called a cluster: data within a cluster are highly similar, while data in different clusters are dissimilar. Because cluster analysis needs only the data themselves and no prior knowledge, it is more widely applicable than classification algorithms that require prior knowledge. Although researchers in the field have proposed many kinds of clustering algorithms, these algorithms still have problems: they cannot detect clusters of arbitrary shape, they need too many parameters, and they perform poorly on data sets with few data points. Aiming at these problems, this thesis proposes improved clustering algorithms built on the existing ones. The primary research work is as follows.

First, a nearest-neighbor hierarchical clustering algorithm constrained by cluster centers determined from data density is proposed. The algorithm consists of two phases. The first, called the pre-merging phase, pre-merges the data and uses the resulting redundant information to compute density. Because this requires no extra input parameter, the density does not suffer from sensitivity to parameter initialization. Moreover, when computing the density of a single datum, the method does not rely only on that datum's local information, so it avoids the statistical error that a purely local estimate causes on data sets with few data points. Once the densities are obtained, the minimal distance of each datum can be computed, and the cluster centers of the data set are found from these two quantities. The second phase, called center-constrained nearest-neighbor hierarchical clustering, uses the cluster centers found in the first phase: during merging it distinguishes clusters that contain a center from clusters that do not, and it applies the center-constrained hierarchical clustering to obtain the final result of the cluster analysis.

Second, a nearest-neighbor hierarchical clustering algorithm constrained by cluster centers based on local density is proposed. To address the limited robustness of the density computed from merged redundant information, a novel density measure based on the Gaussian function is used, and its robustness in finding cluster centers is demonstrated experimentally. Used in the first phase of clustering, it finds cluster centers more robustly and improves the accuracy of the final result obtained in the second phase.

Finally, to verify the efficiency of the proposed methods, comparative experiments against other methods are conducted on both synthetic and real-world data sets. The results demonstrate that, with few parameters, both methods proposed in this thesis complete cluster analysis efficiently and achieve a better classification rate than the other methods.
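
The two-phase idea described above can be illustrated with a minimal sketch. The abstract gives no formulas, so the following are assumptions, not the thesis's actual method: the Gaussian density is taken as a sum of exp(-(d/sigma)^2) terms with a hypothetical bandwidth `sigma`; the "minimal distance" of a datum is taken to be its distance to the nearest datum of higher density; centers are ranked by the product of the two quantities; and the number of centers `k` is supplied by the caller. All function names are hypothetical. The pre-merging density of the first method is not reproduced here.

```python
import math


def pairwise_distances(points):
    """Full Euclidean distance matrix as a list of lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def gaussian_density(dist, sigma):
    """Assumed density: rho_i = sum over j != i of exp(-(d_ij / sigma)^2)."""
    n = len(dist)
    return [
        sum(math.exp(-((dist[i][j] / sigma) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]


def min_distance_to_denser(dist, rho):
    """Assumed 'minimal distance': distance to the nearest strictly denser point.

    A point with no denser point gets its largest distance to any point.
    """
    n = len(dist)
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(dist[i]))
    return delta


def pick_centers(rho, delta, k):
    """Assumed selection rule: the k points with the largest rho * delta."""
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(order[:k])


def center_constrained_single_linkage(dist, centers):
    """Nearest-neighbor (single-linkage) merging that never joins two clusters
    that both already contain a center. Assumes at least one center.

    Returns one label per point: the position of its cluster's center in `centers`.
    """
    n = len(dist)
    parent = list(range(n))
    center_set = set(centers)
    has_center = [i in center_set for i in range(n)]  # valid at cluster roots

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Visiting pairs in increasing distance order reproduces single-linkage merges.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj or (has_center[ri] and has_center[rj]):
            continue  # same cluster already, or merge forbidden by the constraint
        parent[rj] = ri
        has_center[ri] = has_center[ri] or has_center[rj]

    label_of_root = {find(c): label for label, c in enumerate(centers)}
    return [label_of_root[find(i)] for i in range(n)]


if __name__ == "__main__":
    # Two small square blobs, each with a point at its midpoint.
    blob = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
    points = blob + [(x + 10, y + 10) for x, y in blob]
    dist = pairwise_distances(points)
    rho = gaussian_density(dist, sigma=1.0)
    delta = min_distance_to_denser(dist, rho)
    centers = pick_centers(rho, delta, k=2)
    print(centers, center_constrained_single_linkage(dist, centers))
```

On this toy input the two blob midpoints should be selected as centers, and each blob should end up as one cluster. Because merges between two center-containing clusters are skipped, the procedure stops with exactly one cluster per center, which is one plausible reading of "distinguishing clusters with centers from clusters without centers" during merging.
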
Keywords/Search Tags: Cluster analysis, Clustering algorithm, Hierarchical clustering algorithm, Density-based clustering algorithm, Center-constrained clustering algorithm