
Performance modeling framework for SLO-driven MapReduce environments

Posted on: 2013-11-07
Degree: Ph.D.
Type: Thesis
University: University of Illinois at Urbana-Champaign
Candidate: Verma, Abhishek
Full Text: PDF
GTID: 2458390008975038
Subject: Computer Science
Abstract/Summary:
Companies are increasingly using MapReduce for efficient large-scale data processing, such as personalized advertising, spam detection, and data mining. MapReduce users have a growing need to meet different Service Level Objectives (SLOs). Often, an application must complete its data processing within a given deadline. Alternatively, users want to complete a set of jobs as fast as possible. Designing, prototyping, and evaluating new resource allocation and job scheduling algorithms that support these SLOs in MapReduce environments is challenging, labor-intensive, and time-consuming. Hence, accurate and efficient workload management and performance modeling tools are needed.

Our hypothesis is that performance modeling of MapReduce environments through a combination of measurement, simulation, and analytical modeling, aimed at enabling different service level objectives, is feasible, novel, and useful. To support this hypothesis, we propose an analytical performance model based on key performance characteristics measured from past job executions, and we build a simulator capable of replaying these job traces. We survey other attempts at performance modeling and its applications, and contrast them with our work. To demonstrate the usefulness of our techniques, we apply them to achieve service level objectives such as enabling deadline-driven scheduling, optimizing the makespan of a set of MapReduce jobs, and comparing hardware alternatives.
Keywords/Search Tags: MapReduce, Performance modeling, Service level objectives