Font Size: a A A

A Deterministic And Scalable MapReduce For Multicore Systems

Posted on:2016-03-07Degree:MasterType:Thesis
Country:ChinaCandidate:H F CaoFull Text:PDF
GTID:2308330467494922Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
MapReduce is a simple and efficient data-parallel programming model, which is initially targeted for large-scale clusters.With the prevalence of multi-core machines, designing a MapReduce system to fully utilize multi-core resources is becoming increasingly important. Existing MapReduce libraries, such as Phoenix, are often implemented with Pthreads to spawn parallel map or reduce tasks, and rely on shared memory to handle inter-task communications. Due to the unpredictable interleavings of threads, these libraries would generate nondeterministic or even error-prone results when the reduce function is non-commutative.This paper aims to design and implement a deterministic MapReduce library for multicore, which not only ensures deterministic parallel execution for MapReduce applications, but also has good performance. The main contributions of this paper are listed as follows:(1) We propose DMR, a deterministic MapReduce library, which provides Phoenix-compatible API and is built atop a deterministic message passing multithreaded programming library (DetMP). To ensure applications execute determmistically, it assigns tasks to map and reduce threads statically and utilizes the memory isolation mechanism provided by DetMP to allow threads to communicate only via deterministic channels explicitly. To improve the performance of DMR, we adopt several optimizations methods, such as application-oriented scheduler and so on.(2) We summarize the programmability of DetMP based on the development of DMR and the experience of rewriting dedup with DetMP. DetMP is suitable for pipeline parallelism applications. DetMP provides programmers with simple APIs, which frees programmers from managing the store of dataflow and details of concurrent access, thus simplifying parallel programming. But it cannot support several sharing patterns of heap variables among threads which are shown in some workloads of Phoenix benchmark suites, and DMR sloves this by extending the APIs of Phoenix.(3) We conduct the evaluation of DMR on a32-core machine with Phoenix benchmarks suites, and we further analyze the impact on performance of different implementation mechanisms. Evaluation results show that DMR only runs worse than Phoenix on kmeans, and outperforms or matches the performance of Phoenix on the other six workloads. DetMP currently is implemented on a deterministic virtual memory model SPMC. To further analyze the impact on performance of virtual memory model, we implement DetMP with Pthreads shared-memory library, and compare the performance of the two implementations with the Phoenix workloads and a pipeline parallelism program (dedup). Evaluation results show that all applications except for matrix-multiply and string-match have better performance on DetMP implemented atop SPMC at16and32cores, obtaining up to9.5times faster; and the performance of the two implementations is comparable for matrix-multiply and string-match or other applications when the number of cores is less than16.
Keywords/Search Tags:MapReduce, deterministic parallelism, multicore, performance, scalability
PDF Full Text Request
Related items