Scheduling And Optimizing A Big Data Computing Framework Based On CPU/GPU Cluster

Posted on:2016-06-26

Degree:Master

Type:Thesis

Country:China

Candidate:MBARUSHIMANA Emmanuel A M

Full Text:PDF

GTID:2308330476955005

Subject:Computer Science and Technology

Abstract/Summary:

... significant amount of interest as an important technology for revealing the information behind the data, such as trends and characteristics. Recently, many researchers have contributed in different ways to solutions for processing big data, and MapReduce is one of the most popular distributed parallel data processing frameworks. However, some high-end applications, especially some scientific analyses, are both data-intensive and computation-intensive. Therefore, we have designed and implemented a high-performance big data processing framework called Lit, which leverages the power of Hadoop and GPUs.

In this thesis, we present the basic design and architecture of Lit. More importantly, we have put considerable effort into optimizing the communication between CPU and GPU, and we present the strategies used for scheduling data movement. Our approach is inspired in part by code optimizations in the scientific computing community, and we propose instruction fusion. Instruction fusion fuses the code bodies of two GPU instructions in order to i) eliminate unnecessary operations across dependent instructions, ii) reduce data movement between GPU registers and GPU memory, iii) reduce data movement between GPU memory and CPU memory, and iv) improve spatial and temporal locality of memory references.

We also introduce a data flow optimization approach to reduce unnecessary memory copies and, finally, a data communication schedule. Our approach can significantly reduce redundant data communication in most cases...
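The idea behind instruction fusion can be illustrated with a small sketch. This is not code from Lit: the `CountingBuffer` class, the two example operations (scale, then add an offset), and the function names are all hypothetical, and plain Python lists stand in for GPU memory. The sketch only shows why fusing two dependent element-wise steps removes the intermediate buffer and halves the counted memory traffic.

```python
# Illustrative sketch only. Python lists stand in for GPU memory, and the two
# "instructions" (scale, then add an offset) are hypothetical examples.


class CountingBuffer:
    """A list wrapper that counts element loads and stores."""

    def __init__(self, data):
        self.data = list(data)
        self.loads = 0
        self.stores = 0

    def load(self, i):
        self.loads += 1
        return self.data[i]

    def store(self, i, value):
        self.stores += 1
        self.data[i] = value

    def traffic(self):
        return self.loads + self.stores


def run_unfused(values, scale, offset):
    """Two separate passes: the first writes a temporary that the second reads back."""
    n = len(values)
    src = CountingBuffer(values)
    tmp = CountingBuffer([0.0] * n)
    out = CountingBuffer([0.0] * n)
    for i in range(n):  # instruction 1: tmp = src * scale
        tmp.store(i, src.load(i) * scale)
    for i in range(n):  # instruction 2: out = tmp + offset
        out.store(i, tmp.load(i) + offset)
    return out.data, src.traffic() + tmp.traffic() + out.traffic()


def run_fused(values, scale, offset):
    """One pass: the intermediate value stays in a local variable (a 'register')."""
    n = len(values)
    src = CountingBuffer(values)
    out = CountingBuffer([0.0] * n)
    for i in range(n):
        scaled = src.load(i) * scale  # never written to memory
        out.store(i, scaled + offset)
    return out.data, src.traffic() + out.traffic()


unfused_result, unfused_traffic = run_unfused([1.0, 2.0, 3.0, 4.0], 2.0, 1.0)
fused_result, fused_traffic = run_fused([1.0, 2.0, 3.0, 4.0], 2.0, 1.0)
```

Both versions compute the same result, but for four elements the unfused version counts 16 memory accesses and the fused version counts 8, because the temporary buffer is never stored or reloaded. On a real GPU the saving would depend on the kernels involved and on the memory hierarchy.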

Keywords/Search Tags:

MapReduce, Hadoop, Lit, Optimization, Scheduling, Fusion

Related items

1	The Mapreduce Model In The Hadoop Implementation Of Performance Analysis And Optimization Improvements
2	Research On Hadoop Cluster Scheduling Optimization
3	Research On Optimization And Improvement Of MapReduce Job Scheduling Algorithm
4	Research On Scheduling Algorithm In Hadoop Mapreduce
5	Research On MapReduce Model For Fusion Architecture And Accelerated Strategy For Hadoop
6	Research On MapReduce Performance Optimization Based On Hadoop
7	Research And Improvement Of Job Scheduling Algorithm Based On Hadoop
8	Design Of Mapreduce Task Scheduling Algorithms In Heterogeneous Hadoop Cluster
9	The Research Of MapReduce Job Scheduling Algorithm Based On The Hadoop Platform
10	Research And Implement Of Job Scheduling Method For Multi_user Mapreduce Clusters