Research Of K-means Clustering Algorithm

Posted on:2014-12-07

Degree:Master

Type:Thesis

Country:China

Candidate:R R Han

Full Text:PDF

GTID:2298330452462702

Subject:Computer Science and Technology

Abstract/Summary:

Cluster analysis is an important data mining technique used to discover relationships within a dataset. Unlike classification, clustering does not rely on any prior knowledge; it divides the data into classes based on the characteristics of the data themselves. Among the many kinds of clustering methods, partition-based methods are simple and effective and are among the most commonly used. The main ones are K-means, K-medoids, and their variants.

The classic K-means clustering algorithm is a hard partitioning method: it assigns each data object in the dataset strictly to one class. The algorithm is iterative; it terminates when a predetermined convergence condition is met and then outputs the final clustering result. It is simple, easy to understand, and efficient, so it is commonly used on large datasets. K-means also has some defects, however, and many improvements have been proposed, including introducing fuzzy theory and rough set theory into the K-means algorithm, with good results. The main methods are FKM, HKM, RFKM, and their variants.

The thesis first introduces some basic concepts and techniques of cluster analysis, such as data structures, data types, and criterion functions. It then focuses on the classic K-means algorithm, covering its underlying idea and its steps, and analyzes its advantages and disadvantages in detail.

The thesis then introduces fuzzy set theory and rough set theory and their use in improving the classical K-means algorithm. It first presents the fuzzy K-means algorithm and analyzes its advantages and disadvantages, pointing out one relatively obvious shortcoming: the constraint that each data object's membership degrees over all classes must sum to 1 is too strict, so when the dataset contains noise points the algorithm is strongly affected by them. The thesis then presents clustering algorithms based on rough sets; once rough set theory is introduced, the improved algorithm shows large gains in both clustering results and efficiency. Building on these advantages, the thesis improves the fuzzy K-means algorithm and the rough fuzzy K-means algorithm.

A new distance measure, the AM metric, is introduced into the fuzzy K-means algorithm and the rough fuzzy K-means algorithm to improve them. The normalization constraint on membership degrees is also relaxed, which yields the improved algorithms. Experimental analysis shows that replacing the metric and relaxing the membership constraint both improves clustering accuracy and shortens the running time.
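As a rough illustration of two ideas mentioned in the abstract (the hard, iterative assignment of classic K-means, and the fuzzy K-means constraint that each object's membership degrees sum to 1), a minimal Python sketch follows. It is not taken from the thesis: the function names, the random initialization, the fuzzifier m = 2, and the squared Euclidean distance are all assumptions made for illustration. The AM metric and the relaxed membership constraint proposed in the thesis are not shown, because the abstract does not define them.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    """Classic (hard) K-means: every point is assigned to exactly one class."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty class
        # Convergence condition: stop once no center moves more than tol.
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels


def fuzzy_memberships(points, centers, m=2.0, eps=1e-12):
    """Standard fuzzy K-means memberships; each row is normalized to sum to 1."""
    rows = []
    for p in points:
        # eps avoids division by zero when a point coincides with a center.
        inv = [(sq_dist(p, c) + eps) ** (-1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, k=2)
U = fuzzy_memberships(points, centers)
print(labels)
print([round(sum(row), 6) for row in U])
```

Because every row of U is forced to sum to 1, a distant noise point still receives full total membership and therefore pulls on the centers; this is the strictness that the abstract identifies as a weakness of fuzzy K-means.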

Keywords/Search Tags:

K-means clustering algorithm, Rough fuzzy clustering method, Fuzzy membership degree

Related items

1	The Application Of Fuzzy C-means Clustering In The Stock Investment
2	Fuzzy C-means And K-means Clustering Algorithm And Its Parallel
3	Research And Application Of Remote Sensing Image Clustering Based On The Improved Fuzzy C-means Algorithm
4	Probabilistic K-means Models Via Nonlinear Programming
5	Improved Fuzzy C Means Clustering Algorithm And Its Application
6	Research On The Segmentation Method Based On Fuzzy Clustering
7	Research On Clustering And Classification Algorithm Based On Rough Set And Inclusion Degree
8	Research On Risk Degree-Based Safe Semi-Supervised Fuzzy Clustering Algorithm
9	The Improvement On The Fuzzy C-means Algorithm
10	Study Of Auto-Adaption Fuzzy C-Means Clustering Algorithm