
A Fast And Efficient Parallel Bisecting K-Means Algorithm

Posted on: 2014-09-14    Degree: Master    Type: Thesis
Country: China    Candidate: D Y Jiang    Full Text: PDF
GTID: 2268330425966519    Subject: Signal and Information Processing
Abstract/Summary:
The patterns underlying a data set are hidden behind complex relationships among the data and are hard to find. Many cluster analysis methods are used in data mining to uncover such patterns; each has its advantages and disadvantages, and some have been put into practice. K-Means is a simple algorithm, but it has several drawbacks. The number of clusters K must be specified by the user, and the result depends heavily on the choice of initial cluster centers, which makes the method unstable and prone to converging to a local optimum rather than the global optimum.

Image segmentation is the basis of visual perception, but because images differ in structure and content, fast general-purpose image segmentation remains a difficult problem. In the absence of prior knowledge, image segmentation can be performed through cluster analysis. Clustering-based segmentation places few constraints on the sample space, so it generalizes well: it can be applied to gray-scale, color, and texture images alike. It is not perfect, however, mainly because cluster analysis is computationally expensive, suffers from local extrema, and is sensitive to noise in the samples.

This thesis focuses on the K-Means method. Exploiting the property that K-Means bisects data stably, it designs and implements a parallel bisecting K-Means algorithm. The algorithm uses K-Means to split the data in the manner of cell division, building a full binary tree; when the number of leaf nodes exceeds the number of data categories, some of the leaf nodes are merged to obtain the final clustering result. The parallel bisecting K-Means method is then applied to image segmentation.

The main research work of this thesis includes the following aspects:

(1) Comparative tests between the parallel bisecting K-Means method and the bisecting K-Means method. Simulations show that the parallel bisecting K-Means method has lower time complexity and a better clustering effect than the bisecting K-Means method. The time complexity of K-Means grows linearly with the scale of the data, which is its advantage for large-scale data processing. The parallel bisecting K-Means method preserves this advantage and has a lower time complexity. The experimental results show that the parallel bisecting K-Means method is more efficient than both K-Means and bisecting K-Means, and is more suitable for large-scale data processing.

(2) To validate its ability to handle data from a practical application, the parallel bisecting K-Means algorithm is applied to image segmentation and compared with the traditional K-Means algorithm. The experimental results show that, at a similar segmentation quality, the parallel bisecting K-Means algorithm reduces processing time by 25% relative to K-Means.

The experiments show that the proposed algorithm is fast and effective, produces better image segmentation results, and takes less time than the general algorithm.
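
The split-then-merge procedure described in the abstract can be illustrated with a short sketch. This is only one possible reading of the abstract, not the thesis implementation: the function names (`two_means`, `parallel_bisecting_kmeans`), the farthest-point initialization, and the rule of merging the two leaves with the closest centroids are all assumptions made here for illustration. The splits within one tree level are independent of each other, which is where parallel workers could be used; the sketch runs them sequentially.

```python
def mean(points):
    """Centroid of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sqdist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, iters=20):
    """Split one cluster in two with plain 2-means (Lloyd iterations).

    Initialization (assumed, not from the thesis): the point farthest from
    the centroid, and the point farthest from that one.
    Returns [points] unchanged if the cluster cannot be split.
    """
    if len(points) < 2:
        return [points]
    c0 = max(points, key=lambda p: sqdist(p, mean(points)))
    c1 = max(points, key=lambda p: sqdist(p, c0))
    left, right = points, []
    for _ in range(iters):
        left = [p for p in points if sqdist(p, c0) <= sqdist(p, c1)]
        right = [p for p in points if sqdist(p, c0) > sqdist(p, c1)]
        if not left or not right:
            return [points]
        new0, new1 = mean(left), mean(right)
        if (new0, new1) == (c0, c1):
            break
        c0, c1 = new0, new1
    return [left, right]


def parallel_bisecting_kmeans(points, k):
    """Grow a binary tree level by level, then merge surplus leaves."""
    leaves = [list(points)]
    # Division phase: every leaf of the current level is bisected.
    # Each call to two_means is independent, so a level could be
    # dispatched to parallel workers; here it is done sequentially.
    while len(leaves) < k:
        next_level = []
        for leaf in leaves:
            next_level.extend(two_means(leaf))
        if len(next_level) == len(leaves):
            break  # nothing could be split any further
        leaves = next_level
    # Merge phase (assumed criterion): repeatedly merge the two leaves
    # whose centroids are closest until exactly k clusters remain.
    while len(leaves) > k:
        cents = [mean(leaf) for leaf in leaves]
        pairs = ((a, b) for a in range(len(leaves))
                 for b in range(a + 1, len(leaves)))
        i, j = min(pairs, key=lambda ab: sqdist(cents[ab[0]], cents[ab[1]]))
        leaves[i] = leaves[i] + leaves[j]
        del leaves[j]
    return leaves


# Usage: three well-separated groups of 2-D points, k = 3.
blob_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
blob_b = [(10, 0), (11, 0), (10, 1), (11, 1)]
blob_c = [(30, 0), (31, 0), (30, 1), (31, 1)]
clusters = parallel_bisecting_kmeans(blob_a + blob_b + blob_c, 3)
```

With k = 3 the tree grows to four leaves after two levels, so one merge is needed. For image segmentation, each point would be a pixel feature vector (for example a gray level or a color triple) instead of a 2-D coordinate.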
Keywords/Search Tags: Data mining, Clustering, Parallel bisecting K-Means, Image processing, Image segmentation