A Cluster Validation Method for Hierarchical Clustering

Posted on: 2011-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: L L Niu
Full Text: PDF
GTID: 2208330332957975
Subject: Computer software and theory
Abstract/Summary:
A cluster validity index assesses the quality of the clusters produced by a clustering algorithm, so that a clustering result that better reflects the original structure of the data can be obtained. Given the joint probability distribution p(x,y) between a source variable X and a relevant variable Y, the Information Bottleneck (IB) method seeks a compressed representation T of X and can effectively find patterns hidden in the data. The method has a rigorous theoretical foundation and has been applied in many fields. In this thesis, the IB method is applied to the validation of hierarchical clustering.

Many clustering algorithms require the number of clusters, or a range for it, as input, and choosing this value is an important and difficult problem in cluster analysis. To address the problem of determining the correct number of clusters, this thesis proposes a new cluster validity index for hierarchical clustering, IB_Hindex, based on the IB method. The index introduces a cluster density distribution function and incorporates both cluster cohesion and separation, so that the corresponding algorithm can find the number of feature patterns hidden in the data. The algorithm does not depend on any parameters and searches the hierarchical binary tree efficiently, so that it can find an appropriate line at which to cut the dendrogram and obtain the optimal partition. Experimental results show that IB_Hindex can find the correct number of clusters hidden in the data. In particular, when the data consist of clusters of different sizes and densities, IB_Hindex performs much better than other indices, which provides good theoretical support for the application of hierarchical clustering.
Keywords/Search Tags: Clustering, Cluster Validity Index, Dendrogram, IB Method
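
The abstract does not give the formula for IB_Hindex, so the following is not the thesis's index. It is only a minimal Python sketch of the hard agglomerative form of the IB method that the abstract builds on: starting from p(x,y), values of X are merged greedily so that the compressed variable T keeps as much information about Y as possible, and the retained information I(T;Y) is recorded for each level of the resulting binary tree. A validity index would then inspect these levels to choose where to cut. The function names (`mutual_information`, `agglomerative_ib`) and the toy joint distribution are assumptions made for illustration.

```python
import math
from itertools import combinations


def mutual_information(joint):
    """Return I(A;B) in bits for a joint distribution given as a list of rows."""
    row_sums = [sum(row) for row in joint]
    col_sums = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (row_sums[i] * col_sums[j]))
    return mi


def agglomerative_ib(joint):
    """Greedily merge rows of p(x, y).

    Returns a dict mapping each cluster count k to (I(T;Y), partition of X).
    """
    clusters = [list(row) for row in joint]      # each cluster stores p(t, y)
    members = [[i] for i in range(len(joint))]   # indices of x in each cluster
    history = {len(clusters): (mutual_information(clusters), sorted(members))}

    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Hard merge: p(t, y) is the sum of p(x, y) over the members of t.
            merged = [u + v for u, v in zip(clusters[a], clusters[b])]
            trial = [c for k, c in enumerate(clusters) if k not in (a, b)]
            trial.append(merged)
            info = mutual_information(trial)
            # Keeping the largest I(T;Y) is the same as the smallest loss.
            if best is None or info > best[0]:
                best = (info, a, b, merged)

        info, a, b, merged = best
        new_members = sorted(members[a] + members[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        members = [m for k, m in enumerate(members) if k not in (a, b)]
        members.append(new_members)
        history[len(clusters)] = (info, sorted(members))

    return history


# Toy joint distribution (illustrative only): x0 and x1 share one conditional
# distribution over y, and x2 and x3 share another.
joint = [
    [0.20, 0.05],
    [0.20, 0.05],
    [0.05, 0.20],
    [0.05, 0.20],
]

history = agglomerative_ib(joint)
for k in sorted(history, reverse=True):
    info, partition = history[k]
    print(k, round(info, 4), partition)
```

On this toy input the retained information stays essentially unchanged from four clusters down to two and falls to zero only when everything is merged into one cluster, so two clusters is the natural cut. Real data rarely shows such a clean drop, which is the situation a validity index such as the one proposed in the thesis is meant to handle.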