Research And Improvement Of K-Means Clustering Algorithm

Posted on:2016-10-31

Degree:Master

Type:Thesis

Country:China

Candidate:L L Liu

Full Text:PDF

GTID:2208330464463537

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of information technology, daily life, production, and research in many fields are being digitized, generating extremely large volumes of text, images, audio, video, and other forms of data. How to extract previously unknown, potentially valuable information from this mass of data accurately and efficiently is an important problem. Data mining technology has brought many effective methods and tools for solving it. As a new interdisciplinary field, data mining contains several popular research directions. Cluster analysis (referred to as "clustering") is one of the most mature and most widely used data mining techniques. Its main function is to divide a data set into a number of groups according to certain rules, so that data objects in the same group are as similar as possible while data objects in different groups are as different as possible. The similarity between data objects is calculated from the attributes that describe them. At present, cluster analysis is widely applied in machine learning, pattern recognition, image processing, text classification, marketing, statistics, and many other fields. According to their underlying ideas and structure, existing clustering algorithms can be divided into partition-based, hierarchical, grid-based, density-based, and model-based algorithms. The K-means clustering algorithm is a classical partition-based algorithm. This thesis presents an in-depth study and analysis of the merits and defects of the K-means clustering algorithm. Because the results of K-means are liable to be affected by the initial centers, this thesis proposes an improvement to the algorithm.
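
The sensitivity to initial centers described above can be seen on a very small example. The following is a minimal sketch of the standard K-means iteration (assign each point to its nearest center, then move each center to the mean of its points), not the thesis's own implementation. The function names `sq_dist`, `assign`, and `kmeans`, the four sample points, and the two sets of starting centers are all assumptions chosen for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, init_centers, max_iter=100):
    """Standard K-means starting from the given initial centers."""
    centers = [tuple(float(x) for x in c) for c in init_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep the old center if a cluster becomes empty.
                new_centers.append(old)
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


# Four corners of a 4 x 1 rectangle, clustered with k = 2.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, _, good_sse = kmeans(points, [(0, 0), (4, 0)])  # splits left / right
_, _, bad_sse = kmeans(points, [(2, 0), (2, 1)])   # stuck at top / bottom
print(good_sse, bad_sse)  # 1.0 16.0
```

Both runs converge, but the second start stops at a local optimum whose sum of squared errors is sixteen times larger. This is the kind of behavior that motivates replacing or guiding the center-update step.
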
The main work of this thesis is as follows:
(1) It describes the research status of data mining, the research background of cluster analysis, and related concepts.
(2) It studies the basic ideas and principles of the K-means clustering algorithm, analyzes its merits and defects in depth, and analyzes and compares existing improvements to it. To obtain the best number of clusters, an optimization algorithm for the value of K is proposed. Experimental results show that the algorithm successfully resolves the problem of dependence on the K value.
(3) To address the disadvantages that the K-means clustering algorithm is sensitive to the selection of initial centers and easily falls into a local optimum, the differential evolution algorithm, which has strong global optimization ability, is introduced into the clustering algorithm, with its mutation, crossover, and selection operations replacing the iterative updating of cluster centers. The thesis also puts forward an improved differential evolution algorithm and combines it with the K-means clustering algorithm. Finally, experiments verify the effectiveness and feasibility of the improved algorithm.
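
The idea in item (3), letting mutation, crossover, and selection drive the cluster centers instead of the usual mean update, can be sketched roughly as follows. This is the classic DE/rand/1/bin scheme, not the improved variant proposed in the thesis, whose details the abstract does not give. The names `sse` and `de_cluster`, the parameter values (`pop_size`, `F`, `CR`, `generations`, `seed`), and the sample data are assumptions made for illustration. Each individual encodes k centers as one flat vector, and fitness is the sum of squared distances from each point to its nearest center.

```python
import random


def sse(points, flat, k):
    """Sum of squared distances from each point to its nearest center."""
    d = len(points[0])
    centers = [flat[i * d:(i + 1) * d] for i in range(k)]
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(p, ctr)) for ctr in centers)
        for p in points)


def de_cluster(points, k, pop_size=20, F=0.5, CR=0.9,
               generations=100, seed=0):
    """Classic DE/rand/1/bin search over k cluster centers (pop_size >= 4)."""
    rng = random.Random(seed)
    d = len(points[0])
    n = k * d
    # Each individual starts as k distinct data points, flattened.
    pop = []
    for _ in range(pop_size):
        ind = []
        for p in rng.sample(points, k):
            ind.extend(p)
        pop.append(ind)
    fit = [sse(points, ind, k) for ind in pop]
    history = [min(fit)]
    for _ in range(generations):
        new_pop, new_fit = pop[:], fit[:]
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # forces at least one mutated gene
            # Binomial crossover between the mutant and the target.
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(n)
            ]
            # Greedy selection: keep the trial only if it is no worse.
            f = sse(points, trial, k)
            if f <= fit[i]:
                new_pop[i], new_fit[i] = trial, f
        pop, fit = new_pop, new_fit
        history.append(min(fit))
    best = min(range(pop_size), key=fit.__getitem__)
    centers = [tuple(pop[best][i * d:(i + 1) * d]) for i in range(k)]
    return centers, fit[best], history


# Two small, well-separated groups of 2-D points (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9)]
centers, best_sse, history = de_cluster(points, k=2)
print(centers, best_sse)
```

Because selection is greedy, the best fitness in `history` never gets worse from one generation to the next. A hybrid that also applies a K-means update to each individual is one possible way to combine the two algorithms, but the abstract does not say how the thesis does so.
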

Keywords/Search Tags:

Data Mining, Cluster Analysis, K-means Clustering Algorithm, Optimal Clustering Number, Differential Evolution Algorithm

Related items

1	Research And Application On Determining Optimal Number Of Clusters In Cluster Analysis
2	Some Problems Of Determining The Optimal Number Of Clusters In Clustering Analysis
3	Research On Improvement Of K-means Clustering Algorithm
4	Research On Determining Optimal Number Of Clusters In Cluster Analysis
5	Research On Determining Optimal Number Of Clusters In Cluster Analysis
6	Research Of Improved K-means Algorithm And New Cluster Validity Index In Cluster Analysis
7	The Research And Application Of Text Clustering Based On Improved K-means Algorithm
8	Improvement Of Differential Evolution Algorithm And Its Application In K-means Clustering Algorithm
9	Improvements And Implementation Of K-means Clustering Algorithm
10	Clustering Algorithm Based On Differential Evolution Algorithm