
Optimization Of High Performance MapReduce System

Posted on: 2011-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Wang
GTID: 2178360308955371
Subject: Computer software and theory
Abstract/Summary:
In recent years, multi-core clusters have become the mainstream high-performance computer architecture and the most important hardware platform for HPC. Shared variables and message passing are the two most commonly used parallel programming models. However, both operate at too low a level: parallelism must be described explicitly with multiple threads or processes. Developers must spend considerable time and effort learning these models, and must explicitly use their synchronization and communication primitives to coordinate all parallel tasks.

MapReduce is a highly abstract, easy-to-use parallel programming model with a simple programming interface: programs written in a serial style run in parallel automatically. It lets programmers express parallel computations at a higher level of abstraction and in a more understandable way. HPMR, designed in our laboratory, brings the MapReduce model to high-performance computing and supports large-scale task assignment and automatic parallelization. The performance of the current HPMR system still lags behind that of programs written directly with MPI. To make HPMR more practical, this work optimizes the system by combining several optimization techniques. The main work is as follows:

(1) Based on the characteristics of communication in HPMR programs, branch prediction and speculative execution techniques are applied to communication optimization, and a new communication model is designed. The notable feature of communication in an HPMR program is that each round of KV transfer uses, with high probability, the same KV table as the previous round. KV transfer can therefore be carried out speculatively according to the previous round's KV table, so KV routing does not have to be performed in every round. Introducing speculative execution into the HPMR communication model reduces the number of KV routing operations and greatly improves the communication performance of HPMR.

(2) To address the shortcomings of the current HPMR memory management, a highly efficient memory management mechanism is proposed. The current memory management is redundant and inefficient and involves frequent memory copies, so a new memory management scheme based on a memory pool is proposed.

(3) Building on efficient implementations of collective communication, an optimization that identifies collective communication in the routing table is developed. In the current implementation of the HPMR communication system, KV data is transmitted as directed by the routing table and only through point-to-point communication. When a collective operation such as broadcast or scatter would be more suitable, emulating it step by step with point-to-point communication is inefficient. The routing table is therefore optimized to support collective communication and avoid this inefficient emulation.

The goal of this research is to use these optimization techniques to improve the performance of the HPMR system and make it more practical.
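As a rough illustration of ideas (1) and (3), the following Python sketch shows how a previous round's routing table might be reused speculatively and how a send plan might be classified as broadcast, scatter, or point-to-point. It is not the HPMR implementation. Every name here (`route`, `SpeculativeRouter`, `classify_pattern`), the table layout, and the validity check (comparing key sets) are assumptions made for illustration only; a real system would validate a speculative transfer and recover from a misprediction in a more involved way.

```python
def route(keys, num_ranks):
    """Hypothetical routing step: map each key to a destination rank.

    A deterministic toy hash is used so results are reproducible.
    """
    return {k: sum(map(ord, k)) % num_ranks for k in keys}


class SpeculativeRouter:
    """Reuses the previous round's table when the prediction holds (assumed design)."""

    def __init__(self, num_ranks):
        self.num_ranks = num_ranks
        self.last_table = None      # table produced by the last routing run
        self.routing_runs = 0       # how often full routing was performed
        self.speculation_hits = 0   # how often the old table was reused

    def table_for(self, keys):
        keys = set(keys)
        # Prediction: this round needs the same table as the last round.
        if self.last_table is not None and self.last_table.keys() == keys:
            self.speculation_hits += 1
            return self.last_table
        # First round or misprediction: fall back to full routing.
        self.routing_runs += 1
        self.last_table = route(keys, self.num_ranks)
        return self.last_table


def classify_pattern(sends, num_ranks):
    """Classify a send plan given as {source_rank: {dest_rank: payload}}.

    Returns "broadcast" if a single source sends the same payload to every
    other rank, "scatter" if a single source sends differing payloads to
    every other rank, and "point-to-point" otherwise.
    """
    if len(sends) != 1:
        return "point-to-point"
    (src, by_dst), = sends.items()
    others = set(range(num_ranks)) - {src}
    if not others or set(by_dst) != others:
        return "point-to-point"
    payloads = list(by_dst.values())
    if all(p == payloads[0] for p in payloads):
        return "broadcast"
    return "scatter"


if __name__ == "__main__":
    router = SpeculativeRouter(num_ranks=4)
    for round_keys in (["a", "b"], ["b", "a"], ["a", "c"]):
        router.table_for(round_keys)
    print(router.routing_runs, router.speculation_hits)
    print(classify_pattern({0: {1: "x", 2: "x"}}, num_ranks=3))
```

In this sketch the second round reuses the first round's table because the key set is unchanged, while the third round triggers a new routing run. A pattern classified as broadcast or scatter could then be handed to the corresponding collective operation instead of a sequence of point-to-point sends.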
Keywords/Search Tags: high-performance MapReduce, speculative execution, memory pool, identification of collective communication