The distribution of OpenCL kernel execution across multiple devices

Posted on: 2015-01-08  Degree: M.A.S  Type: Thesis
University: University of Toronto (Canada)  Candidate: Gurfinkel, Steven  Full Text: PDF
GTID: 2478390017490532  Subject: Engineering
Abstract/Summary:
Many computer systems now include both CPUs and programmable GPUs. OpenCL, a new programming framework, can program individual CPUs or GPUs; however, distributing a problem across multiple devices is more difficult. This thesis contributes three OpenCL runtimes that automatically distribute a problem across multiple devices: DualCL and m2sOpenCL, which distribute tasks across a single system's CPU and GPU, and DistCL, which distributes tasks across a cluster's GPUs. DualCL and DistCL run on existing hardware, whereas m2sOpenCL runs in simulation. On both a system with a discrete GPU and a system with integrated CPU and GPU devices, DualCL improves performance over a single device when host memory is used, as measured with programs from the Rodinia benchmark suite. Running similar benchmarks, m2sOpenCL shows that reducing the overheads present in current systems improves performance. DistCL accelerates unmodified, compute-intensive OpenCL kernels, obtained from Rodinia, AMD samples, and elsewhere, when distributing them across a cluster.
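
The abstract does not say how the three runtimes divide a kernel's work, so the following is only a minimal sketch of one common approach, not the thesis's method: split a one-dimensional range of work-items into contiguous per-device chunks, aligned to the work-group size and sized by a relative weight per device. The function name `partition_ndrange` and the `weights` parameter are hypothetical and introduced here for illustration.

```python
def partition_ndrange(global_size, local_size, weights):
    """Split [0, global_size) into one contiguous (offset, size) chunk per device.

    global_size: total number of work-items (must be a multiple of local_size)
    local_size:  work-group size; chunks are kept aligned to it
    weights:     hypothetical relative throughput of each device
    """
    if local_size <= 0 or global_size % local_size != 0:
        raise ValueError("global_size must be a positive multiple of local_size")
    if not weights or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    num_groups = global_size // local_size
    total = sum(weights)
    chunks = []
    start_group = 0
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if i == len(weights) - 1:
            # The last device takes whatever remains, so nothing is dropped.
            end_group = num_groups
        else:
            end_group = round(num_groups * cumulative / total)
        end_group = max(end_group, start_group)
        chunks.append((start_group * local_size,
                       (end_group - start_group) * local_size))
        start_group = end_group
    return chunks


if __name__ == "__main__":
    # Two devices, the second assumed to be three times faster than the first.
    for offset, size in partition_ndrange(1024, 64, [1, 3]):
        print(offset, size)
```

With OpenCL 1.1 or later, each resulting pair could in principle be passed as the `global_work_offset` and `global_work_size` arguments of `clEnqueueNDRangeKernel` on that device's command queue. The sketch ignores data movement between devices, which the abstract suggests is a significant cost in practice.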
Keywords/Search Tags: OpenCL, Across, GPU, Devices