Mapping and scheduling hardware tasks on the high-performance reconfigurable architectures

Posted on: 2010-05-19
Degree: Ph.D
Type: Dissertation
University: The George Washington University
Candidate: Huang, Miaoqing
GTID: 1448390002983911
Subject: Computer Science
Abstract/Summary:
High-Performance Reconfigurable Computers (HPRCs) are traditional High-Performance Computers (HPCs) augmented with reconfigurable hardware co-processors, typically based on Field-Programmable Gate Arrays (FPGAs). HPRCs are capable of providing significant performance improvements for many scientific and engineering applications. Executing a hardware task graph on an FPGA consists of two steps. The first step, the mapping step, is to select a proper hardware implementation for each task if multiple implementation variants are available. The second step, the scheduling step, is to schedule hardware tasks into multiple FPGA configurations in an efficient way. This research investigates hardware task mapping and scheduling optimization mechanisms that improve the performance of HPRC systems.

In this research, a hardware task scheduling technique, known as Reduced Data Movement Scheduling (RDMS), is proposed to maximize performance under real-life constraints. RDMS schedules hardware tasks into the least number of configurations and significantly reduces inter-configuration communication. Furthermore, RDMS leverages the data dependency among the tasks to form longer pipelines in order to improve the throughput of each single configuration. Compared with existing scheduling approaches, RDMS was shown to reduce inter-configuration communication by up to 46% in simulations using randomly generated data flow graphs. The practicality and efficiency of the proposed algorithm were demonstrated by emulating a task graph from a real-life application, N-body simulation, under realistic constraints for bandwidth and FPGA parameters from existing HPRCs, including Cray XD1 and SRC-6.

Additional improvement was introduced by incorporating a hardware library of architectural variants. Multiple implementation variants for the same hardware task enabled tradeoffs between the hardware resources consumed and the task execution throughput. A genetic algorithm (GA)-based mapping approach is developed to find a near-optimal solution, i.e., a combination of task implementations, in a reasonable time. Each chromosome represents a possible mapping between hardware tasks and their implementation variants. Actual numbers for the architectural constraints, such as interconnect bandwidth and reconfiguration time, are taken from three different reconfigurable platforms: SGI RC100, SRC-6, and Cray XD1. The results demonstrated improvements of up to 78.6% in execution time compared with choosing a fixed implementation variant for each task.
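
The chromosome encoding described in the abstract can be illustrated with a small sketch. This is not the dissertation's algorithm or cost model: the variant library, the area budget, the penalty weight, and all function names below are hypothetical, and the cost function simply sums task times instead of modeling bandwidth or reconfiguration time. Each chromosome holds one variant index per task, and the initial population is seeded with the fixed-variant baseline so that, with elitism, the result is never worse than that baseline.

```python
import random

# Hypothetical variant library: for each task, a list of (area, time) options.
# All numbers are made up for illustration only.
LIBRARY = [
    [(10, 9.0), (20, 5.0), (40, 3.0)],
    [(15, 8.0), (30, 4.0)],
    [(5, 6.0), (10, 4.0), (25, 2.0)],
    [(20, 7.0), (35, 4.5)],
]
AREA_BUDGET = 80  # assumed resource limit, not taken from any real platform


def cost(chromosome):
    """Sum of the chosen variants' times, heavily penalized on area overflow."""
    area = sum(LIBRARY[task][gene][0] for task, gene in enumerate(chromosome))
    time = sum(LIBRARY[task][gene][1] for task, gene in enumerate(chromosome))
    overflow = max(0, area - AREA_BUDGET)
    return time + 100.0 * overflow


def random_chromosome(rng):
    """One gene per task; each gene is an index into that task's variants."""
    return [rng.randrange(len(variants)) for variants in LIBRARY]


def crossover(a, b, rng):
    """One-point crossover between two parent chromosomes."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chromosome, rng, rate=0.1):
    """Re-draw each gene with probability `rate`."""
    return [rng.randrange(len(LIBRARY[task])) if rng.random() < rate else gene
            for task, gene in enumerate(chromosome)]


def tournament(population, rng, k=3):
    """Pick the cheapest of k randomly sampled chromosomes."""
    return min(rng.sample(population, k), key=cost)


def run_ga(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    baseline = [0] * len(LIBRARY)  # fixed variant 0 for every task
    population = [baseline] + [random_chromosome(rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        children = [min(population, key=cost)]  # elitism: keep the best as-is
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return min(population, key=cost)


if __name__ == "__main__":
    best = run_ga()
    print("mapping:", best, "cost:", cost(best))
```

Penalizing area overflow instead of discarding infeasible chromosomes keeps the sketch short; a mapping approach tied to a real platform would replace `cost` with a model of the actual scheduling and reconfiguration overheads.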
Keywords/Search Tags:Hardware, Task, Reconfigurable, Performance, Mapping, Scheduling, Implementation, RDMS