The Design And Implementation Of A MapReduce Computing Framework Based On GPU Cluster

Posted on: 2014-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: G Wang
GTID: 2308330482951981
Subject: Computer software and theory
Abstract/Summary:
With the development of modern computer technology, the volume of collected data keeps growing, and processing large-scale data quickly has become an urgent problem. MapReduce, a distributed computing model, simplifies the development of parallel applications, and it was adopted widely and quickly after it was proposed in 2004. However, existing MapReduce systems are usually built on CPU clusters: a single node has limited computing ability, and the upfront investment and maintenance costs are relatively high. In recent years, as GPU performance and programmability have improved, GPU acceleration has become a promising way to speed up large-scale data processing, and how to realize the MapReduce framework in a heterogeneous CPU/GPU environment has become an active research topic. This thesis studies that problem and makes the following contributions:
1. It presents a MapReduce prototype based on a CPU/GPU heterogeneous system. Most computing tasks are offloaded to the GPU for acceleration, which improves single-node processing capability while retaining the original MapReduce workflow.
2. After analyzing data movement across the entire system, it proposes and designs a MapReduce data scheduling system. This system accelerates data exchange between GlusterFS and the CPU, between the CPU and the GPU, and between GPUs.
3. Because the environment is heterogeneous, the computing ability of nodes in the cluster may differ. Building on MapReduce's load balancing algorithm, it proposes and implements a load-balanced task scheduling scheme that can be applied to GPU-based MapReduce.
4. It implements a MapReduce parallel computing framework (NESTOR) based on a CPU/GPU heterogeneous system and verifies it with PKTM, a typical geological application. Experiments show that NESTOR can complete the basic MapReduce workflow, simplify the development of GPU programs, and accelerate applications with large-scale computation.
Keywords/Search Tags:GPU Cluster, MapReduce, Load Balancing, Data Scheduling, PKTM
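
To make the load balancing idea in contribution 3 concrete, here is a minimal Python sketch of capability-aware task assignment. It is not NESTOR's scheduler, which the abstract does not describe. The function name `assign_tasks`, the node names, and the relative speed values are all hypothetical, and the greedy earliest-finish-time rule is a common textbook heuristic substituted here for illustration only.

```python
def assign_tasks(task_costs, node_speeds):
    """Greedy sketch: hand each task (largest first) to the node that
    would finish it earliest, given that node's relative speed.

    task_costs  -- list of abstract work units per task (hypothetical)
    node_speeds -- dict of node name -> relative throughput (hypothetical)
    """
    assignment = {name: [] for name in node_speeds}
    finish = {name: 0.0 for name in node_speeds}  # projected busy time per node

    # Visit tasks in decreasing cost order so large tasks are placed first.
    for task_id, cost in sorted(enumerate(task_costs), key=lambda t: -t[1]):
        # Pick the node with the earliest projected finish for this task;
        # ties are broken by node name to keep the result deterministic.
        best = min(
            node_speeds,
            key=lambda n: (finish[n] + cost / node_speeds[n], n),
        )
        assignment[best].append(task_id)
        finish[best] += cost / node_speeds[best]
    return assignment, finish


if __name__ == "__main__":
    # Assumed example: one GPU node four times faster than one CPU-only node.
    plan, busy = assign_tasks([4, 4, 4, 4, 4, 4], {"gpu_node": 4.0, "cpu_node": 1.0})
    for node, tasks in plan.items():
        print(node, tasks, busy[node])
```

In this sketch the faster node naturally receives more tasks, which is the behavior a heterogeneous cluster needs; a real scheduler would also have to account for data locality and transfer costs, which this example ignores.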