Font Size: a A A

parallel_dp: The parallel dynamic programming design pattern as an IntelRTM Threading Building Blocks algorithm template

Posted on:2013-04-06Degree:M.SType:Thesis
University:University of Arkansas at Little RockCandidate:Serfass, DouglasFull Text:PDF
GTID:2458390008968567Subject:Computer Science
Abstract/Summary:
Our work was composed of three parts, with the goal of selecting either the Cilk or Go languages or C++ Intel® Threading Building Blocks (TBB) in order to create a template for parallel dynamic programming. Use of a template for parallel dynamic programming will enable programmers to attain increased speed from multiple-core processors without having be experts in synchronization, load balancing and cache optimization. In part one, we compared Cilk and Go performance. We created a scalable parallel algorithm based on parallelizing the merge sorting network algorithm. We implemented our scalable parallel algorithm using Go and Cilk. Both Cilk and Go implementations increased computation speed for larger problem sizes. We compared Go and Cilk scheduling and synchronization efficiency and found that Cilk is more efficient and, thus, delivers higher speedup than Go. We compared the language efficiency of Go and Cilk and found that Go is faster than Cilk. Based on this research, we decided to eliminate Cilk as a candidate for a template for parallel dynamic programming. In part two, we replaced Cilk with TBB in our research, and compared Go and TBB performance. Both Go and TBB offer high level parallel programming mechanisms built on top of threads. Go goroutines and TBB task classes are used as the computation units that are mapped to physical threads on multi-core processors. The synchronization mechanisms in Go and TBB are the channel and the task scheduler, respectively. In part two we utilized these mechanisms to implement a parallel version of the optimal binary search tree dynamic programming algorithm in Go and TBB. Both implementations tile the iteration space and construct and evaluate a direct acyclic task graph for optimal parallelism without over constraints. We compared Go and TBB speedup and performance to create a benchmark of how efficient Go and TBB are at evaluating a direct acyclic task graph. Our experimental results show that the overhead of task scheduling and synchronization in TBB is much smaller than Go and that the overall performance of TBB is 1.6 to 3.6 times faster than Go. Based on this research, in part three, we decided to enhance TBB in order to create a template for parallel dynamic programming. TBB is an ideal environment for implementation of the parallel dynamic programming design pattern. The task-based parallelism of TBB readily lends itself to the realization of the participants and participant collaboration of this design pattern. We propose the parallel_dp algorithm template, an implementation of the parallel dynamic programming design pattern using TBB. We define the participants and the participant collaboration of this design pattern and describe the classes and functions utilized to implement the design pattern in parallel_dp. We analyze the performance of our solution by applying parallel_dp to create four TBB programs that are parallel versions of the four types of dynamic programming algorithms. Our experimental results prove that parallel_dp provides speedup to all of our TBB programs and near linear speedup to one of our TBB programs. We conclude that parallel_dp will improve the performance of any TBB program that includes a dynamic programming algorithm.
Keywords/Search Tags:Dynamic programming, Parallel, TBB, Algorithm, Cilk, Performance, Part
Related items