K-means Algorithm Improvement And Application

Posted on: 2015-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: L F Wei
Full Text: PDF
GTID: 2308330482457034
Subject: Computer software and theory
Abstract/Summary:
Clustering is the analytic process of grouping a set of physical or abstract objects into classes of similar objects. It is an important human activity whose aim is to organize data on the basis of similarity. Clustering draws on many fields, including mathematics, statistics, biology, and economics, and has now been applied in a wide range of areas. Clustering techniques are used to describe data, measure the similarity between different data sources, and assign those data sources to separate clusters.

The K-means algorithm is one of the most fundamental and widely used clustering algorithms. It is an unsupervised, partition-based clustering method. The original K-means algorithm has several shortcomings. For example, every iteration must compute the distance from each data point to each cluster center, which is computationally expensive. When the data already carry an original classification, the resulting clusters may not meet users' needs, because K-means is unsupervised and does not take the effect of that original classification into account. These defects seriously affect the quality and efficiency of cluster analysis.

This thesis makes several improvements to the K-means algorithm. It presents two improved algorithms and applies them to a series of data sets. The cluster center migration method considers the n cluster centers closest to a point X in the current iteration and uses the migration of the cluster centers between successive iterations to reduce the amount of computation, thereby saving clustering time. The thesis also discusses the distance measure and the time complexity of this method.
Applying this method to the clustering of image collections shows that it raises clustering efficiency and lowers computational complexity. The CK-means algorithm proposed in this thesis can effectively handle data sources that come with an initial classification, and it can be regarded as a supervised method. Its advantage is that it takes almost the same amount of time as unsupervised K-means and less time than the supervised clustering method, while producing better clustering quality than the supervised clustering method. Experiments on test data are used to verify the results and confirm these advantages of CK-means.

Finally, the thesis summarizes its main work and points out directions for further research.
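
The abstract does not spell out how center migration is used to avoid distance computations, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes a standard triangle-inequality bound of the kind used in accelerated K-means variants: each point keeps an upper bound on the distance to its assigned center and a lower bound on the distance to every other center, and both bounds are adjusted by how far the centers migrated since the previous iteration. The names `kmeans_with_migration_bounds`, `two_nearest`, and the `skipped` counter are hypothetical and chosen for illustration.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def two_nearest(p, centers):
    """Return (index of nearest center, its distance, distance to second nearest)."""
    best_j, best_d, second_d = -1, math.inf, math.inf
    for j, c in enumerate(centers):
        d = dist(p, c)
        if d < best_d:
            best_j, second_d, best_d = j, best_d, d
        elif d < second_d:
            second_d = d
    return best_j, best_d, second_d


def kmeans_with_migration_bounds(points, k, max_iter=100, seed=0):
    """K-means that skips distance computations using center migration.

    Illustrative assumption, not the thesis's method: a point keeps its
    cluster without any distance computation when
    upper bound (own center) <= lower bound (all other centers).
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    n = len(points)
    assign, upper, lower = [0] * n, [0.0] * n, [0.0] * n
    for i, p in enumerate(points):
        assign[i], upper[i], lower[i] = two_nearest(p, centers)

    skipped = 0  # number of point visits that needed no distance computation
    for _ in range(max_iter):
        # Update step: move each center to the mean of its assigned points.
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p, a in zip(points, assign):
            counts[a] += 1
            for d in range(dim):
                sums[a][d] += p[d]
        new_centers = [
            [s / counts[j] for s in sums[j]] if counts[j] else centers[j]
            for j in range(k)
        ]
        # Migration distance of every center between the two iterations.
        shift = [dist(c, nc) for c, nc in zip(centers, new_centers)]
        centers = new_centers
        max_shift = max(shift)
        if max_shift == 0.0:
            break  # no center moved, so the assignment is stable

        # Assignment step: loosen the bounds by the migration, then test them.
        for i, p in enumerate(points):
            upper[i] += shift[assign[i]]  # own center moved at most this far away
            lower[i] -= max_shift         # other centers moved at most this much closer
            if upper[i] <= lower[i]:
                skipped += 1              # assignment provably unchanged
                continue
            assign[i], upper[i], lower[i] = two_nearest(p, centers)
    return centers, assign, skipped
```

Calling `kmeans_with_migration_bounds(points, k)` on a list of equal-length numeric tuples returns the final centers, the cluster index of each point, and the number of point visits that were skipped. Apart from tie-breaking, the assignments match those of plain K-means started from the same initial centers, because a point is skipped only when the bounds prove that its nearest center cannot have changed.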
Keywords/Search Tags: Clustering analysis, K-means, Clustering center migration algorithm, CK-means algorithm