Research On K-Means Algorithm Based On MapReduce

Posted on: 2017-01-26  Degree: Master  Type: Thesis
Country: China  Candidate: C Wu
GTID: 2428330548980958  Subject: Software engineering
Abstract/Summary:
Because of its simplicity and practicality, the K-Means algorithm is widely applied in both business and academia. However, its results depend heavily on the choice of initial center points, so the clustering effect is often not ideal, and when it processes large data sets it is prone to memory overflow, which prevents the algorithm from running. Running K-Means on Hadoop addresses the memory problem, but it still does not solve the problem of clustering quality in a large-data environment: the results are unstable and the accuracy is low. This dissertation combines the MapReduce framework with random sampling to implement a random-sampling K-Means algorithm based on MapReduce. Comparison experiments show that the improved algorithm achieves a better speedup on the cluster, and the results of multiple accuracy tests show that it has a better clustering effect.
Keywords/Search Tags:K-Means, Distributed computation, MapReduce, Random sampling
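
The abstract does not give the details of the sampling step, so the following is only a minimal single-process sketch of the general idea, under the assumption that a random sample is clustered first and its centers are then used to initialize K-Means on the full data set. The names map_phase, reduce_phase, kmeans and sampled_kmeans are hypothetical, and the map and reduce steps are plain Python functions that imitate the MapReduce pattern rather than calls to the Hadoop API.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_phase(points, centers):
    """Mapper: emit (index of the nearest center, point) for every point."""
    for p in points:
        idx = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        yield idx, p


def reduce_phase(pairs, centers):
    """Reducer: average the points assigned to each center.

    A center that received no points keeps its previous position.
    """
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)
    for idx, members in groups.items():
        dim = len(members[0])
        new_centers[idx] = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dim)
        )
    return new_centers


def kmeans(points, centers, max_iter=50, tol=1e-9):
    """Repeat map and reduce until the centers stop moving."""
    for _ in range(max_iter):
        new_centers = reduce_phase(map_phase(points, centers), centers)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers


def sampled_kmeans(points, k, sample_size, seed=0):
    """Assumed scheme: cluster a random sample, then refine on all points."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    seeds = rng.sample(sample, k)          # random starting centers
    initial = kmeans(sample, seeds)        # cheap pass on the sample only
    return kmeans(points, initial)         # full pass with better start
```

Calling sampled_kmeans(points, k=2, sample_size=20) on a list of coordinate tuples returns a list of k center tuples. In a real Hadoop job the mapper and reducer would run on separate nodes over data splits; here they only show which work belongs to which phase.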