Research of Data Allocation Strategy on MapReduce Model

Posted on: 2014-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Yu
Full Text: PDF
GTID: 2268330422463516
Subject: Computer application technology
Abstract/Summary:
Since the birth of Cloud Computing in 2007, it has gradually become a popular concept in the IT industry at home and abroad and has received widespread attention. With the rapid development of the Internet and a sharp increase in data volume, how to store and process massive data quickly and efficiently has become an urgent problem. This problem is also the motivation behind Cloud Computing. Cloud Computing, however, is a way of thinking: to really exploit its strengths, there must be a programming model that supports it, not only hardware facilities. The MapReduce parallel programming model, proposed by Google, provides software support for massive data processing in Cloud Computing.

Hadoop works in a reliable, efficient, and scalable way, and in just a few years it has become a major open source Cloud Computing platform. Hadoop is still a relatively young platform, however, and it is inadequate in many respects, so improving it is necessary. In this thesis, we study the MapReduce parallel programming model in Hadoop and propose a solution to the imbalanced distribution of intermediate data at the output of Map. Based on sampling of the original data set, we estimate the distribution of the data set and propose a load balancing allocation strategy called the LAB allocation strategy.

This allocation strategy distributes the intermediate data at the output of Map evenly. It ensures load balance at the input of Reduce, makes full use of computing resources, avoids wasting resources, and improves program efficiency. The experiments show that the improved allocation strategy greatly improves the execution efficiency of MapReduce jobs.
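The abstract does not describe the internals of the LAB strategy, so the following is only a rough sketch of the general idea it names: sample the input, estimate how often each intermediate key occurs, and assign keys to reducers so that the estimated loads are even. The greedy "heaviest key to the lightest reducer" rule, the toy byte-sum hash used as the baseline, and all function names here are assumptions made for illustration. They are not taken from the thesis or from the Hadoop API.

```python
import heapq
import random
from collections import Counter


def hash_partition(key, num_reducers):
    """Baseline: a toy deterministic hash partitioner (illustrative only)."""
    return sum(key.encode("utf-8")) % num_reducers


def sample_key_counts(records, sample_rate, seed=0):
    """Estimate key frequencies from a random sample of (key, value) records."""
    rng = random.Random(seed)
    return Counter(key for key, _ in records if rng.random() < sample_rate)


def build_balanced_assignment(key_counts, num_reducers):
    """Greedy assignment (assumed rule, not the thesis's algorithm):
    take keys from heaviest to lightest and give each one to the reducer
    with the smallest estimated load so far."""
    heap = [(0, r) for r in range(num_reducers)]  # (estimated load, reducer id)
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(key_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for key, count in ordered:
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + count, reducer))
    return assignment


def reducer_loads(records, num_reducers, assignment=None):
    """Count records per reducer; keys missing from the assignment
    (for example, keys never seen in the sample) fall back to hashing."""
    loads = [0] * num_reducers
    for key, _ in records:
        if assignment is not None and key in assignment:
            reducer = assignment[key]
        else:
            reducer = hash_partition(key, num_reducers)
        loads[reducer] += 1
    return loads


if __name__ == "__main__":
    # Skewed toy data: key "a" dominates the intermediate output.
    records = [(k, 1) for k, n in
               [("a", 50), ("b", 30), ("c", 20), ("d", 10), ("e", 10)]
               for _ in range(n)]
    counts = sample_key_counts(records, sample_rate=0.5)
    plan = build_balanced_assignment(counts, num_reducers=2)
    print("hash loads:    ", reducer_loads(records, 2))
    print("balanced loads:", reducer_loads(records, 2, plan))
```

In a real Hadoop job, an assignment table like this would typically be consulted from a custom partitioner. How the thesis integrates its strategy with Hadoop is not stated in the abstract.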
Keywords/Search Tags: Cloud Computing, Hadoop, MapReduce, Data Allocation