Font Size: a A A

Research On Clustering Algorithm Based On Minimum Spanning Tree

Posted on:2020-04-15Degree:MasterType:Thesis
Country:ChinaCandidate:X B LvFull Text:PDF
GTID:2438330575977202Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Clustering method plays an important role in data mining and image processing.However,most clustering methods including partitional-based,hierarchical-based,density-based,and grid-based approaches cannot obtain the satisfactory results due to the diversity and complexity of data.Sufficient empirical evidence has shown that minimum spanning tree(MST)representation is invariant to detailed geometric changes in the boundaries of clusters.Therefore,the shape of cluster has little influence on the performance of MST-based clustering algorithm.Based on this,a lot of research focus on the MST-based clustering algorithm.However,most of these methods are not suitable for data sets with different characteristics or sensitive to noise.Therefore,it is of great significance to propose a clustering algorithm that can adapt to the data sets of multiple shapes.Based on the current research,we mainly concentrate on the following work:(1)The existing initialization algorithm has certain limitations in describing the initial cluster center,which leads to the result that the clustering method can only adapt to the spherical data sets.This paper proposes an initial clustering center selection algorithm based on density geodesic distance.We believe that the clustering centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities.We select the intital cluster centers according to the two characteristics.And the algorithm can adapt to different data set with different shape.(2)For the MST-based clustering algorithm,its performance is less affected by the shape of clusters.However,the definition of the inconsistent edges is a major issue that has to be addressed in all MST-based clustering algorithms.Most MST-based clustering algorithms take the longer edge as the candidate edge of inconsistent edges,which makes the algorithm sensitive to the shape of cluster shape as well as noise.In this paper,the candidate edges of inconsistent edge are limited in the the path between the initial clustering centers,which greatly reduces the searching range of candidate edges.Moreover,the results are insensitive to noise.(3)In most clustering validation indexes,Euclidean distance is used as the similarity measure and the average distance between the pairwise points in two clusters are taken as the between-cluster distance,which leads to the fact that the validation results are limited to the shape of clusters.We propose a novel intercluster separation by computing the distance between the points at the intersection of clusters.Furthermore,we propose a new internal clustering validation measure to select the best clustering result.The experimental results on the synthetic data sets,real data sets,and image data sets demonstrate the good performance of the proposed MST based method.
Keywords/Search Tags:Clustering, Initial clustering center, Geodesic distance, Minimum Spanning Tree, Inconsistent edges, Internal clustering evaluation index
PDF Full Text Request
Related items