
The Research On Clustering Algorithm Based On Manifold Distance And Bee Colony

Posted on: 2017-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: H Ou
Full Text: PDF
GTID: 2348330521950526
Subject: Software engineering
Abstract/Summary:
We live in an age of data. Data are everywhere in daily life and are growing explosively, beyond what can be counted. To obtain the latent, useful information in massive datasets and guide people toward sound judgments and decisions, these data must be mined. Clustering is both an important method of data analysis and an active research topic, but existing clustering algorithms still have defects that call for improved algorithms, and overcoming them also matters for difficulties met in practice. Starting from the similarity measure of traditional clustering algorithms, this thesis discusses the shortcomings, on certain special datasets, of clustering algorithms that use Euclidean distance as the similarity measure, and the advantages of those that use manifold distance. Manifold-distance-based algorithms also have deficiencies, so this thesis further studies granular computing, rough set theory, and swarm algorithms, and combines them with an improved manifold distance to improve the performance of the original algorithms. The main work is as follows:

(1) The traditional k-means algorithm selects its initial cluster centers at random, and with manifold distance as the similarity measure it shows flaws in global consistency at large scale. To address these problems, this thesis introduces granular computing: the data are partitioned by attribute, and the initial centers are then selected by max-min distance. Finally, manifold distance and a criterion function are used to obtain the best cluster centers and the clustering. Experimental results show that the algorithm achieves good global consistency on the datasets and reduces running time.

(2) Manifold-distance-based clustering performs better on "absolute manifold" datasets than on "relative manifold" datasets. Drawing on the clustering characteristics of rough sets, this thesis first selects the cluster centers by attribute partitioning and max-min distance, and then uses manifold distance as the similarity measure in rough set clustering. Experimental results show that the algorithm effectively improves clustering on "relative manifold" datasets.

(3) To improve the performance of existing clustering algorithms based on manifold distance, this thesis performs a secondary clustering of the datasets that uses the improved manifold distance as the similarity measure and combines it with a bee colony algorithm. First, the dataset is initialized based on local density and neighbor selection; then the improved bee colony algorithm performs a fine classification of the data. Experimental results show improved clustering on the datasets.
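The two building blocks named in the abstract, max-min distance center selection and a manifold distance, can be illustrated with a minimal Python sketch. The abstract defines neither the thesis's improved manifold distance nor its attribute partitioning, so everything here is an assumption: the edge length `rho ** d - 1` is one density-sensitive formulation used in the manifold-distance literature, the manifold distance is taken as the shortest-path length over the complete graph, and the function names and the choice of the first point as the starting center are hypothetical.

```python
import math


def euclidean_matrix(points):
    """Pairwise Euclidean distances as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def manifold_matrix(points, rho=2.0):
    """Pairwise manifold distances (assumed formulation, not the thesis's).

    Each edge gets the density-sensitive length rho ** d - 1 (rho > 1),
    which penalizes long jumps; the manifold distance between two points
    is then the shortest-path length, computed here with Floyd-Warshall.
    """
    n = len(points)
    d = [[rho ** math.dist(p, q) - 1.0 for q in points] for p in points]
    for m in range(n):              # intermediate point
        for i in range(n):
            for j in range(n):
                via = d[i][m] + d[m][j]
                if via < d[i][j]:
                    d[i][j] = via
    return d


def max_min_centers(dist, k, first=0):
    """Select k center indices by max-min (farthest-first) distance.

    dist  : square matrix of pairwise distances
    first : index of the starting center (an arbitrary assumption here)
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    centers = [first]
    # min_d[i] = distance from point i to its nearest chosen center
    min_d = list(dist[first])
    while len(centers) < k:
        nxt = max(range(n), key=lambda i: min_d[i])
        centers.append(nxt)
        min_d = [min(a, b) for a, b in zip(min_d, dist[nxt])]
    return centers


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
    print(max_min_centers(euclidean_matrix(pts), 3))
    print(max_min_centers(manifold_matrix(pts, rho=1.5), 3))
```

Passing a precomputed distance matrix keeps the center selection independent of the similarity measure, so Euclidean and manifold distance can be swapped without changing the selection code. The Floyd-Warshall step is cubic in the number of points and is only suitable for small illustrative datasets.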
Keywords/Search Tags:clustering algorithm, manifold distance, attribute partitioning, rough set, bee colony algorithm, local density, neighbors selecting