GPU-based MapReduce schemes for big data processing

Posted on:2014-06-01

Degree:M.S

Type:Thesis

University:Arkansas State University

Candidate:Chen, Yi

GTID:2458390008453832

Subject:Computer Science

Abstract/Summary:

The MapReduce programming model and its implementations have simplified many parallel applications. Because of the rising demand for higher computing performance, Graphics Processing Units (GPUs) have been used to accelerate MapReduce in several studies. Unlike on a CPU, high GPU utilization requires not only a decent parallel algorithm but also careful consideration of hardware details. This paper describes the development path of our MapReduce system from a single GPU to multiple GPUs. Utilization of each GPU is improved by using new GPU features such as streams and Hyper-Q. Furthermore, several scheduling schemes are designed to avoid blocked GPU operations. To address the challenge of Big Data, our MapReduce system handles large data sets that exceed GPU and even CPU memory. Experimental results show the performance improvement and increased scalability gained from each acceleration technique. Although our current work is specific to MapReduce, many underlying ideas are also applicable to the acceleration of other GPU applications.

Keywords/Search Tags:

GPU, MapReduce, Data

Related items

1	Design And Implementation Of A Data Integration System Based On MapReduce
2	Research On Distributed Fast Clustering Algorithm Based On Mapreduce
3	Research And Strategy On Data Skew Problem Based On MapReduce
4	Research On Improved Association-rules Algorithm Based On Mapreduce
5	Research And Implementation Of Local Priority Scheduling Algorithm Based On Mapreduce For Massive Data
6	College Big Data Analysis And Mining Based On MapReduce
7	The Research Of Data Cleaning Algorithm Based On MapReduce
8	Research On The Clustering Algorithm Of Parallel Partition Based On MapReduce
9	Data Processing Of Complex Structured Data Based On MapReduce
10	The Research Of Handling Data Skew In MapReduce Computing Model