GPU and FPGA Coprocessors for Data Intensive Computations

Posted on: 2015-08-09
Degree: Ph.D.
Type: Thesis
University: Northwestern University
Candidate: Honbo, Daniel
GTID: 2478390017994164
Subject: Computer Engineering
Abstract/Summary:
With the current norm of multi-core processors, stagnant clock rates, and slowing gains from instruction-level parallelism, it has become increasingly important to exploit parallelism in order to achieve acceptable performance for data intensive tasks. While multi-core processors are well suited to exploiting thread-level parallelism, they are often a suboptimal choice for problems that exhibit abundant data parallelism. This thesis investigates the application of Graphics Processing Unit (GPU) and Field Programmable Gate Array (FPGA) coprocessors to data intensive, data parallel workloads.

Since adopting a unified shader architecture and a general programming model, GPUs have become an increasingly important alternative to general-purpose processors for compute intensive applications, because their peak floating-point performance is well above that of general-purpose processors. We investigate GPU coprocessors for a simple particle simulation and demonstrate the performance benefit of offloading spatial transformations and basic particle motion calculations to a GPU. We also study a GPU coprocessor for the k-Means clustering algorithm and demonstrate application speedups of 40-70x.

FPGAs are hardware devices capable of implementing arbitrary digital circuits. The vast internal bandwidth and low power consumption afforded by these devices make them an attractive target for certain data parallel workloads. We investigate an FPGA architecture for Decision Tree Classification that can achieve a speedup of 30x for the split determination phase of the algorithm. We also present a fast pairwise statistical significance estimation architecture using an FPGA coprocessor that offloads the alignment task to an accelerator designed to process multiple independent alignments concurrently, resulting in an end-to-end speedup of more than 200x relative to a baseline software implementation.
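
To illustrate why k-Means is a natural fit for a data parallel coprocessor, the following is a minimal sequential sketch of one k-Means iteration in plain Python. It is not the implementation from the thesis; the function names `assign_clusters` and `update_centroids` are hypothetical, and squared Euclidean distance is assumed. The point to notice is that the nearest-centroid search for each point depends only on that point and the shared centroids, so each iteration of the outer loop could in principle be mapped to an independent GPU thread.

```python
def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid.

    Each point is handled independently of every other point, which is
    the data parallelism a GPU could exploit (one thread per point).
    """
    labels = []
    for p in points:
        best_index, best_dist = 0, float("inf")
        for k, c in enumerate(centroids):
            # Squared Euclidean distance (an assumption of this sketch).
            d = sum((a - b) ** 2 for a, b in zip(p, c))
            if d < best_dist:
                best_index, best_dist = k, d
        labels.append(best_index)
    return labels


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it.

    A centroid with no assigned points is left unchanged.
    """
    dims = len(centroids[0])
    sums = [[0.0] * dims for _ in centroids]
    counts = [0] * len(centroids)
    for p, k in zip(points, labels):
        counts[k] += 1
        for j in range(dims):
            sums[k][j] += p[j]
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (10, 10), (10, 11)]
    centroids = [(0, 0), (10, 10)]
    labels = assign_clusters(points, centroids)
    print(labels)
    print(update_centroids(points, labels, centroids))
```

In a coprocessor setting, the assignment step is typically the part worth offloading, since its cost grows with the number of points times the number of centroids, while the update step is a reduction that needs coordination between workers.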
Keywords/Search Tags:FPGA, Processors, Data intensive, GPU, Parallelism