Scheduling And Optimizing A Big Data Computing Framework Based On CPU/GPU Cluster

Posted on: 2016-06-26  Degree: Master  Type: Thesis
Country: China  Candidate: MBARUSHIMANA Emmanuel A M  Full Text: PDF
GTID: 2308330476955005  Subject: Computer Science and Technology
Abstract/Summary:
... significant amount of interest as an important technology for revealing the information behind data, such as trends and characteristics. Recently, many researchers have contributed solutions for processing big data, and MapReduce is one of the most popular distributed parallel data processing frameworks. However, some high-end applications, especially some scientific analyses, are both data-intensive and computation-intensive. We have therefore designed and implemented a high-performance big data processing framework called Lit, which leverages the power of Hadoop and GPUs.

In this thesis, we present the basic design and architecture of Lit. More importantly, we have put considerable effort into optimizing the communication between CPU and GPU, and we present the strategies used for scheduling data movement. Our approach is inspired in part by code optimizations in the scientific computing community, and we propose instruction fusion. Instruction fusion fuses the code bodies of two GPU instructions in order to i) eliminate redundant operations across dependent instructions, ii) reduce data movement between GPU registers and GPU memory, iii) reduce data movement between GPU memory and CPU memory, and iv) improve spatial and temporal locality of memory references.

We also introduce a data flow optimization approach to reduce unnecessary memory copies and, finally, a data communication schedule. Our approach can significantly reduce redundant data communication in most cases...
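The idea behind instruction fusion can be illustrated with a small sketch. This is not Lit's implementation, which the abstract does not show; it is a toy Python model under stated assumptions. The `Memory` class, `run_unfused`, and `run_fused` are hypothetical names introduced only for illustration, and `Memory` merely counts element loads and stores as a stand-in for traffic to GPU memory. Two dependent element-wise operations are executed first as separate passes with an intermediate buffer, then as one fused pass that keeps the intermediate value in a local variable (standing in for a register).

```python
from typing import Callable, List


class Memory:
    """Toy stand-in for GPU memory: counts element loads and stores."""

    def __init__(self) -> None:
        self.reads = 0
        self.writes = 0

    def load(self, buf: List[float], i: int) -> float:
        self.reads += 1
        return buf[i]

    def store(self, buf: List[float], i: int, value: float) -> None:
        self.writes += 1
        buf[i] = value


Op = Callable[[float], float]


def run_unfused(mem: Memory, data: List[float], f: Op, g: Op) -> List[float]:
    """Two separate passes: the result of f goes through an intermediate buffer."""
    n = len(data)
    tmp = [0.0] * n
    out = [0.0] * n
    for i in range(n):  # first "instruction": tmp = f(data)
        mem.store(tmp, i, f(mem.load(data, i)))
    for i in range(n):  # second "instruction": out = g(tmp)
        mem.store(out, i, g(mem.load(tmp, i)))
    return out


def run_fused(mem: Memory, data: List[float], f: Op, g: Op) -> List[float]:
    """One fused pass: the intermediate value stays in a local variable."""
    n = len(data)
    out = [0.0] * n
    for i in range(n):
        x = f(mem.load(data, i))  # never written back to memory
        mem.store(out, i, g(x))
    return out


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    for runner in (run_unfused, run_fused):
        mem = Memory()
        result = runner(mem, data, lambda x: x * 2.0, lambda x: x + 1.0)
        print(runner.__name__, result, mem.reads, mem.writes)
```

In this model the fused version performs half as many loads and stores as the unfused one for the same result, which corresponds to benefit ii) in the list above. Savings on a real GPU depend on the instructions being fused and on the hardware.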
Keywords/Search Tags: MapReduce, Hadoop, Lit, Optimization, Scheduling, Fusion