Research On Connectivity-based Cluster Validity

Posted on: 2011-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: S C Zhang
Full Text: PDF
GTID: 2178360305960275
Subject: Computer application technology
Abstract/Summary:
In machine learning, cluster analysis is one of the most important problems. Its aim is to partition a given data set into meaningful groups such that data points in the same group are more similar to each other than to points in different groups. Cluster analysis has been widely used in many fields, such as engineering, business, and social science. Many important problems related to cluster analysis remain unsolved. Rather than addressing all of them, we focus on a single issue: the cluster validity problem.

In this paper, we study the cluster validity problem from the perspective of connectivity. Many existing validity indices face two problems. First, most of them cannot handle clusters of arbitrary shape well. Second, they neglect differences in intra-cluster compactness between clusters. To address the first problem, we use the connect distance. To address the second, we propose a new idea for defining cluster validity indices: the validity index of a whole clustering result should be built upon the validity indices of its individual clusters. This allows the index to handle clustering results whose clusters differ in intra-cluster compactness. Based on this idea, we propose a new connectivity-based cluster validity index.

Experimental results on synthetic and real data sets show the effectiveness of our validity index.
Keywords/Search Tags: cluster analysis, cluster validity index, connect distance
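
The abstract does not define the connect distance, so the following is only an illustrative sketch of why a connectivity-style distance can help with clusters of arbitrary shape. It assumes one common choice, the minimax path distance (the smallest achievable "largest hop" over all paths between two points), which may differ from the definition used in the thesis. The function name `minimax_path_distances` and the toy data are hypothetical.

```python
import math

def minimax_path_distances(points):
    """All-pairs minimax path distance (an ASSUMED stand-in for a
    connectivity-based distance; not taken from the thesis).

    For each pair (i, j), returns the minimum over all paths from i to j
    of the largest Euclidean hop on that path.
    """
    n = len(points)
    # Start from plain Euclidean distances (the direct one-hop path).
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Floyd-Warshall-style relaxation, with max() in place of addition.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d

# Toy data: an elongated chain of six points plus one distant point.
chain = [(float(x), 0.0) for x in range(6)]   # (0,0), (1,0), ..., (5,0)
outlier = (2.0, 10.0)
points = chain + [outlier]

d = minimax_path_distances(points)

# The two ends of the chain are far apart in Euclidean terms but are
# linked by unit-length hops, so their minimax distance stays small.
euclid_ends = math.dist(points[0], points[5])
connect_ends = d[0][5]
# The outlier is still far from the chain under the minimax distance.
connect_outlier = d[0][6]
```

In this toy setting the chain ends are 5 units apart by Euclidean distance but only 1 unit apart by minimax path distance, while the distant point remains about 10 units from the chain. This is the kind of behavior that lets a connectivity-based index treat an elongated cluster as compact.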