Parallelizing matrix-oriented computing applications on reconfigurable computing fabrics: Methodology and architecture

Posted on: 2007-01-16
Degree: Ph.D
Type: Dissertation
University: University of South Carolina
Candidate: Akella, Sreesa
Full Text: PDF
GTID: 1448390005972668
Subject: Computer Science
Abstract/Summary:
The reconfigurable computing machine (RCM) paradigm and the predominant FPGA device architecture give us the capability to exploit high degrees of parallelism and obtain significant performance gains over microprocessor-based systems. It would seem that, given a sufficiently large custom computing fabric, we could achieve speedups greater than the two orders of magnitude seen thus far on RCM platforms. However, we find that the limited number of memory banks on commercial RCMs inhibits forming massively parallel computations: because access to these memories is serialized, the computations become starved and are thus themselves serialized.

In this dissertation defense, we present a novel architecture, MPP-RCM, a massively parallel processing, reconfigurable computing machine architecture. Aside from a large number of processing element (PE) units, MPP-RCM employs a large number of small memory elements (MEs) for each PE. We obtain highly scalable speedup by abandoning traditional Von Neumann memory architectures, in which memory is considered as one or more large physical arrays, and employing a large number of smaller, distributed memory units, each uniquely addressable in parallel.

We have chosen two matrix-oriented processing applications, namely the Sparse Matrix Vector Multiplication (SMVM) kernel and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), for experimental validation of MPP-RCM. We scaled them to large numbers of PEs and MEs over a wide range of problem sizes. We analyze the system partitioning, data dependency and data locality characteristics of the architecture for these two applications, and assess the speedup and scalability of MPP-RCM using Amdahl's Law and Gustafson's Law metrics. We present the performance gains achieved and discuss the issues that affect the scalability and overall performance of the architecture.

Finally, we discuss possible "footprints" for a physical RCM platform built using Xilinx® FPGA devices based on the MPP-RCM logical architecture.*

*This dissertation is a compound document (contains both a paper copy and a CD as part of the dissertation).
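
For readers unfamiliar with the terms named in the abstract, the following is a minimal Python sketch of an SMVM kernel and of the two speedup metrics. It is not taken from the dissertation. The compressed sparse row (CSR) layout, the function names, and the parameter names (`parallel_fraction`, `num_pes`) are assumptions chosen for illustration, and the abstract does not say which sparse format or which exact form of the laws the author used.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A @ x, with A stored in CSR form.

    CSR is an assumed storage format, used here only for illustration.
    Each row's dot product is independent of the others, which is the
    property that lets rows be spread across many processing elements.
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def amdahl_speedup(parallel_fraction, num_pes):
    """Amdahl's Law: speedup for a fixed problem size."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_pes)


def gustafson_speedup(parallel_fraction, num_pes):
    """Gustafson's Law: scaled speedup when the problem grows with num_pes."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * num_pes


if __name__ == "__main__":
    # A = [[2, 0, 1],
    #      [0, 3, 0],
    #      [4, 0, 5]]
    values = [2.0, 1.0, 3.0, 4.0, 5.0]
    col_idx = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
    print(amdahl_speedup(0.9, 10), gustafson_speedup(0.9, 10))
```

Under Amdahl's Law the serial fraction caps the achievable speedup no matter how many PEs are added, whereas Gustafson's Law assumes the workload grows with the PE count, so the two metrics bracket the fixed-size and scaled-size views of scalability.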
Keywords/Search Tags: Architecture, Reconfigurable computing, RCM, Applications, Parallel