
K-medoids Cluster Analysis Based On Improved Cuckoo Algorithm And Its Parallel Implementation

Posted on: 2019-10-04
Degree: Master
Type: Thesis
Country: China
Candidate: N Yang
GTID: 2428330545482405
Subject: Computer technology
Abstract/Summary:
Over the past 20 years, with the rapid development of information technology, the amount of data generated in every field has grown steadily, and the concept of big data has attracted attention from all sectors of society. Faced with large volumes of complicated data, people have come to see it as an important resource for human progress, and finding valuable information in it has become urgent. Data mining usually refers to the process of using algorithms to search for information hidden in massive data; through this process, valuable information can be extracted from large, complicated data sets. The K-medoids clustering algorithm is easy to implement, efficient, and widely used. As data mining techniques have been explored further, the field has seen numerous innovations, one of which is applying intelligent optimization algorithms to K-medoids. The cuckoo search algorithm, a swarm intelligence optimization method that recently emerged within evolutionary computation, draws on biological evolution theory, has few parameters, and performs an effective random search. This thesis first improves the cuckoo search algorithm, then combines the improved algorithm with K-medoids to perform cluster analysis from the selected initial centroids. Finally, the combined algorithm is evaluated in parallel experiments under the MapReduce framework. The specific work is as follows:

(1) The concepts of cluster analysis, the K-medoids algorithm, the cuckoo search algorithm, and the MapReduce framework are briefly described. The ideas, processes, advantages, and disadvantages of the two algorithms are analyzed.

(2) An adaptive discovery probability is introduced into the cuckoo search algorithm. By replacing the fixed discovery probability parameter with a dynamically changing adaptive discovery probability, the algorithm converges faster in the early stage, converges more accurately in the later stage, and searches for the optimal solution more efficiently.

(3) The improved cuckoo search algorithm is applied to K-medoids, yielding a K-medoids optimization algorithm based on the improved cuckoo search algorithm. It combines the strengths of the cuckoo search algorithm with adaptive discovery probability and of the K-medoids algorithm, and compensates for the original K-medoids algorithm's sensitivity to the initial centroid selection and its tendency to fall into local optima, which improves overall performance.

(4) The K-medoids optimization algorithm based on the cuckoo search algorithm with adaptive discovery probability is run in parallel experiments on a big data platform, which shows that the algorithm has good application prospects in big data cluster analysis.

Experiments on test functions and data sets show that the improved cuckoo algorithm converges faster and finds better solutions. The combination of the cuckoo algorithm and K-medoids achieves higher clustering quality and accuracy, and its convergence speed is also improved. Finally, the parallel experiments further demonstrate the high performance of the algorithm.
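
The abstract does not give the formula for the adaptive discovery probability or the details of how cuckoo search is coupled to K-medoids, so the following Python sketch is only an illustration of the general idea under stated assumptions. It assumes a linearly decreasing discovery probability (the hypothetical `adaptive_pa`, with made-up bounds `pa_max` and `pa_min`), represents each nest as k continuous centers that are snapped to their nearest data points, and uses the usual K-medoids objective (total distance to the nearest medoid) as fitness. All function names and parameter values are illustrative and are not taken from the thesis.

```python
import math
import random


def adaptive_pa(t, max_iter, pa_max=0.5, pa_min=0.05):
    """Assumed schedule: discovery probability falls linearly from pa_max to pa_min."""
    return pa_max - (pa_max - pa_min) * t / max(1, max_iter - 1)


def total_cost(points, medoid_idx):
    """K-medoids objective: sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoid_idx) for p in points)


def snap(points, centers):
    """Map each continuous center to the index of its nearest data point."""
    return [min(range(len(points)), key=lambda i: math.dist(points[i], c))
            for c in centers]


def levy_step(rng, beta=1.5):
    """Levy-distributed step length via Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = abs(rng.gauss(0, 1)) or 1e-12  # guard against division by zero
    return u / v ** (1 / beta)


def cuckoo_medoids(points, k, n_nests=15, max_iter=60, alpha=0.1, seed=0):
    """Search for k initial medoids; returns (sorted medoid indices, cost)."""
    rng = random.Random(seed)

    def random_nest():
        return [points[i] for i in rng.sample(range(len(points)), k)]

    def fitness(nest):
        return total_cost(points, snap(points, nest))

    nests = [random_nest() for _ in range(n_nests)]
    costs = [fitness(n) for n in nests]

    for t in range(max_iter):
        # Levy flights relative to the current best nest (simplified:
        # a candidate replaces its own nest only if it is better).
        best = nests[costs.index(min(costs))]
        for i in range(n_nests):
            cand = [tuple(x + alpha * levy_step(rng) * (x - b) for x, b in zip(c, bc))
                    for c, bc in zip(nests[i], best)]
            cand_cost = fitness(cand)
            if cand_cost < costs[i]:
                nests[i], costs[i] = cand, cand_cost

        # Abandon nests with the adaptive probability, always keeping the best.
        pa = adaptive_pa(t, max_iter)
        best_i = costs.index(min(costs))
        for i in range(n_nests):
            if i != best_i and rng.random() < pa:
                nests[i] = random_nest()
                costs[i] = fitness(nests[i])

    best_i = costs.index(min(costs))
    return sorted(snap(points, nests[best_i])), costs[best_i]
```

The returned indices would then serve as the starting medoids for an ordinary K-medoids refinement, which is omitted here. A high early discovery probability replaces many nests and favors exploration, while a low late value preserves good nests; this is one plausible reading of the early-speed and late-accuracy behavior described in item (2), not a confirmed description of the thesis's schedule.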
Keywords/Search Tags: Data Mining, Clustering Analysis, K-medoids, Cuckoo Search Algorithm, MapReduce