Research On MapReduce Model For Fusion Architecture And Accelerated Strategy For Hadoop

Posted on: 2017-04-26 | Degree: Master | Type: Thesis
Country: China | Candidate: X Chen | Full Text: PDF
GTID: 2428330488479844 | Subject: Information and Communication Engineering
Abstract/Summary:
Parallel programming on heterogeneous architectures, typified by CPU-GPU heterogeneous computing, requires vendor programming standards such as OpenCL or CUDA to exploit the power of the GPU. The complexity of the GPU's underlying internal structure makes this difficult for ordinary developers. The MapReduce programming model has been applied successfully on CPU and GPU devices and gives developers an abstract programming interface. Building on existing MapReduce programming models for CPUs or GPUs, this paper proposes a new MapReduce programming model, FGMR, for a new heterogeneous architecture, the Fusion CPU-GPU, and provides programmers with a uniform programming API. The contributions are summarized as follows.

First, this paper analyzes a variety of heterogeneous programming models based on MapReduce. Previous models target separate CPU-GPU architectures, and their use of global atomic operations introduces serious delays within the GPU. We propose a programming model named FGMR for the new heterogeneous architecture. To remove the effect of global atomic operations, it uses multiple hash tables to solve the problem of parallel writes among threads. We also analyze static and dynamic task scheduling strategies, adopt dynamic task scheduling to make full use of both the CPU and the GPU, and quantitatively analyze the effect of different task sizes. Four workloads are implemented on Mars, MapCG, and FGMR. Experimental results show that our system performs better than the other two.

Second, this paper analyzes a variety of methods that use heterogeneous architectures to accelerate the CPU-based Hadoop system. It extends the single-node model to a distributed system and exploits multi-level parallelism to speed up data processing. The paper presents the framework of Fusion-based Hadoop for the multi-node case and verifies the efficiency of this model using the K-means algorithm as the workload on different datasets. The results show that Fusion-based Hadoop outperforms the original CPU-based Hadoop in execution time and confirm that the FGMR model has good portability.
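The two design ideas named in the abstract, multiple hash tables instead of one shared table and dynamic task scheduling, can be illustrated with a minimal CPU-only sketch. This is not the FGMR API, which the abstract does not describe: the function name `word_count`, the parameters `num_workers` and `chunk_size`, and the use of Python threads as stand-ins for CPU and GPU workers are all assumptions made for illustration.

```python
import queue
import threading
from collections import Counter


def word_count(records, num_workers=4, chunk_size=2):
    """Hypothetical word-count job: dynamic scheduling plus per-worker hash tables."""
    # Dynamic scheduling: tasks sit in a shared queue and each worker pulls
    # the next chunk whenever it becomes idle, so faster workers take more.
    tasks = queue.Queue()
    for start in range(0, len(records), chunk_size):
        tasks.put(records[start:start + chunk_size])

    # One private hash table per worker, so map output is never written
    # to a structure shared with another worker.
    tables = [Counter() for _ in range(num_workers)]

    def worker(table):
        while True:
            try:
                chunk = tasks.get_nowait()
            except queue.Empty:
                return  # no tasks left
            for line in chunk:
                table.update(line.split())

    threads = [threading.Thread(target=worker, args=(t,)) for t in tables]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Reduce step: merge the private tables once all workers have finished.
    merged = Counter()
    for table in tables:
        merged.update(table)
    return merged
```

Here `chunk_size` plays the role of the task size whose effect the abstract says was measured: smaller chunks balance load more evenly but add scheduling overhead, larger chunks do the reverse.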
Keywords/Search Tags:Fusion CPU-GPU, MapReduce, Hadoop, Kmeans, Accelerate