Research On Clustering Algorithm Based On Minimum Spanning Tree

Posted on:2014-08-04

Degree:Master

Type:Thesis

Country:China

Candidate:X Chen

Full Text:PDF

GTID:2268330392972411

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

Data mining is one of the most advanced research field in database technology andinformation decision, and draws nutrition from multiple disciplines. Because clusteringanalysis was an important research direction in data mining, so the further study forclustering analysis had important value in theory and practical application. Thedevelopment of clustering analysis methods had several decades, and researchers trieddifferent approaches to solve clustering problems during this period. Among manymethods proposed, minimum spanning tree clustering algorithm was a typical method.But this kind of existing methods still existed deficiencies, so there was a need toimprove existing clustering analysis technology by proposing a new clustering theoryand method.Firstly, this paper proposed a new minimum spanning tree algorithm calledQMST(Quick Minimum Spanning Tree) on the basis of Prim and Kruskal. QMSTdivided data set into three parts including upperã€lower and temporary data set, andreduced the data scale required to be processed greatly, so it could generate a minimumspanning tree quickly, and the required time was better than Prim and Kruskal. When itdealt with high dimensional data set, it introduced a dimensionality reduction strategyaccording to attribute difference principle, so it could still maintain rapid characteristicsin process of generating minimum spanning tree for high dimensional data set. Secondly,this paper proposed a quick self-adaptive clustering analysis method based on minimumspanning tree QSAK-MST(Quick Self-adaptive K-meansâ€“Minimum Spanning Tree)by analyzing existing minimum spanning tree clustering algorithms, this algorithmcombined with classification and density clustering thought. This method couldgenerate minimum spanning tree of data set quickly by QMST, it didnâ€™t depend onparameter selection, but compared compactness degree of clusters and divided spanningtrees iteratively by using splitCriterion, then generated the initial clustering centers andthe initial clusters. Then it used the core idea of K-means clustering analysis methodand the initial clustering centersã€the initial clusters from the before stage, took theminimal square error function. Each object was assigned to the most similar cluster bymean value of objects in clusters. It updated the mean value of clusters, and repeatedthis process until the function converged or it reached the maximum iteration number.This local adjustment process could effectively alleviate deviation when it divided minimum spanning tree and reduced dimensionality of high dimensional data set, inorder to improve the accuracy of clustering results. QSAK-MST solved the problems ofsetting parameters beforehand and spending too much time on generating minimumspanning tree, it increased clustering speed and improved clustering accuracy. At thesame time, this method took a dimension reduction strategy, so it was still effective indealing with high dimensional data set, and it controlled the clustering error caused bydimension reduction, maintained clustering accuracy within an acceptable range. At last,QSAK-MST and two typical methods based on minimum spanning tree are analyzedand compared by using several real data sets, the experimental results show that thismethod is better in clustering speedã€clustering stability and clustering accuracy andverify effectiveness of QSAK-MST.

Keywords/Search Tags:

Minimum Spanning Tree, Initial Clustering Centers, Compactness Degree, Split Criterion, Dimension Reduction

PDF Full Text Request

Related items

1	Research On Clustering Algorithm Based On Minimum Spanning Tree
2	Research And Application Of Web Chinese Text Clustering Algorithm Based On Minimum Spanning Tree
3	Research Of Clustering Algorithms Based On Minimum Spanning Tree
4	Research On Minimum Spanning Tree Algorithm For Degree Constraints Based On Multi - Populations
5	Research On Heuristic Algorithms For The Degree Constrained Minimum Spanning Tree Problem
6	Degree-constrained Minimum Spanning Tree Algorithms
7	Research On Algorithms For Solving Two Minimum Spanning Tree Variant Problems
8	WCJ Genetic Algorithm For Degree-Constrained Minimum Spanning Tree
9	Wcj Genetic Algorithm For Degree-constrained Minimum Spanning Tree
10	Research On The Degree Constrained Minimum Spanning Tree Problem Based On Partheno-Genetic Algorithm