Improving performance in Hadoop MapReduce

Posted on: 2015-09-24

Degree: M.S.

Type: Thesis

University: Oklahoma State University

Candidate: Aina, Ademola Chukwudi

Full Text: PDF

GTID: 2478390017997691

Subject: Computer Science

Abstract/Summary:

Hadoop MapReduce is a parallel, distributed programming model for processing large data sets, so-called Big Data, on a cluster. The basic idea of MapReduce is to split a large input data set into many small pieces and assign these pieces to different machines for processing [5]. In this thesis, we evaluate the performance of the MapReduce framework. MapReduce can be improved so that speculative execution achieves maximum performance, and optimizing the cost of computation and the cost of communication helps achieve that improvement. The optimization proceeds in two steps. The first step measures the processing power of each machine and distributes tasks according to each machine's capacity. The second step measures the communication overheads and distributes tasks in the system for a given job or workload. To this end, we represent the Hadoop MapReduce execution with a functional model and develop an optimization model for improving performance in the system. Our experiments show that the proposed optimized functional model outperforms the regular functional model of the Hadoop MapReduce system by a factor of 2.
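
The capacity-based distribution step can be illustrated with a small sketch. The abstract does not give the thesis's actual optimization model, so this is an assumption made for illustration only: tasks are split in proportion to a measured per-machine capacity score, with leftover tasks going to the machines with the largest fractional share. The function name `allocate_tasks`, the node names, and the capacity values are hypothetical, and communication overhead is not modeled here.

```python
from typing import Dict


def allocate_tasks(capacities: Dict[str, float], num_tasks: int) -> Dict[str, int]:
    """Split num_tasks across machines in proportion to their capacity.

    Hypothetical illustration only: this is not the model from the thesis.
    Uses the largest-remainder method so the counts always sum to num_tasks.
    """
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")
    if not capacities or any(c <= 0 for c in capacities.values()):
        raise ValueError("every machine needs a positive capacity")

    total = sum(capacities.values())

    # Ideal (fractional) share of tasks for each machine.
    shares = {m: num_tasks * c / total for m, c in capacities.items()}

    # Start from the whole-number part of each share.
    alloc = {m: int(s) for m, s in shares.items()}

    # Give the remaining tasks to the machines with the largest fractions.
    leftover = num_tasks - sum(alloc.values())
    by_remainder = sorted(capacities, key=lambda m: shares[m] - alloc[m], reverse=True)
    for machine in by_remainder[:leftover]:
        alloc[machine] += 1

    return alloc


if __name__ == "__main__":
    # Assumed relative processing power, e.g. from a benchmark run.
    measured = {"node-a": 4.0, "node-b": 2.0, "node-c": 1.0}
    print(allocate_tasks(measured, 14))
```

In this sketch a machine with twice the measured capacity receives roughly twice as many tasks, which is the intuition behind the first optimization step described above.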

Keywords/Search Tags:

MapReduce, Hadoop, Model, Performance

Related items

1	The Mapreduce Model In The Hadoop Implementation Of Performance Analysis And Optimization Improvements
2	Improving performance in Hadoop MapReduce
3	The Performance Optimization And Improvement Of MapReduce In Hadoop
4	Application Research Of The Performance Optimization For Map Reduce Model In Hadoop
5	Research On Improving The Fault Tolerance Performance In MapReduce
6	The Research Of Performance Optimization Of Hadoop In Big Data
7	The Research Of Improving Performance Of Hadoop Cluster
8	Research On MapReduce Model For Fusion Architecture And Accelerated Strategy For Hadoop
9	Research On MapReduce Performance Optimization Based On Hadoop
10	MapReduce Performance Research And Optimization Based On Block Aggregation