Compiler optimization of value communication for thread-level speculation

Posted on:2006-01-19

Degree:Ph.D

Type:Dissertation

University:Carnegie Mellon University

Candidate:Zhai, Antonia

GTID:1458390008469186

Subject:Computer Science

Abstract/Summary:

In the context of Thread-Level Speculation (TLS), inter-thread value communication is the key to efficient parallel execution. From the compiler's perspective, TLS supports two forms of inter-thread value communication: speculation and synchronization. Speculation allows for maximum parallel overlap when it succeeds, but becomes costly when it fails. Synchronization, on the other hand, introduces a fixed cost regardless of whether the dependence actually occurs. The fixed cost of synchronization is determined by the critical forwarding path, which is the time from when a thread first receives a value from its predecessor to when a new value is generated and forwarded to its successor. In the baseline implementation used in this dissertation, we synchronize all register-resident values and speculate on all memory-resident values. However, this naive approach yields little performance gain because of the excessive cost of inter-thread value communication. The goal of this dissertation is to develop compiler-based techniques that reduce the cost of inter-thread value communication and improve overall program performance.

This dissertation proposes to use the compiler to orchestrate inter-thread value communication for both memory-resident and register-resident values. To improve the efficiency of inter-thread value communication, the compiler must first decide whether to synchronize or to speculate on a potential data dependence based on how frequently the dependence occurs. If synchronization is necessary, the compiler then inserts the corresponding signal and wait instructions, creating a point-to-point path to forward the values involved in the dependence. Because synchronization could serialize execution by stalling the consumer thread, we use the compiler to avoid such stalling by applying novel dataflow analyses that schedule instructions to shrink the critical forwarding path.

This dissertation reports the performance impact of several compiler-based value communication optimization techniques on a four-processor single-chip multiprocessor that has been extended to support thread-level speculation. Performance is measured relative to the original sequential program executing on a single processor, for the set of loops selected to maximize program performance. Parallel execution with the proposed baseline implementation results in a 1% performance degradation for integer benchmarks and a 21% performance improvement for floating point benchmarks. With the optimization techniques we developed, parallel execution achieves 22% and 42% performance improvements for integer and floating point benchmarks, respectively.
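
The two compiler decisions described in the abstract can be illustrated with a small toy model. The sketch below is not taken from the dissertation: the function names, the expected-cost comparison, and all cycle counts are hypothetical, chosen only to show why a frequent dependence favors synchronization and why moving the signal earlier (a shorter critical forwarding path) reduces stalling on a four-processor machine.

```python
def choose_mechanism(dep_frequency, violation_penalty, sync_stall):
    """Pick a communication mechanism for one potential dependence.

    Assumed cost model (illustrative only): speculation costs
    violation_penalty cycles each time the dependence actually occurs,
    while synchronization always costs sync_stall cycles.
    """
    expected_speculation_cost = dep_frequency * violation_penalty
    if expected_speculation_cost > sync_stall:
        return "synchronize"
    return "speculate"


def simulate(num_epochs, epoch_len, wait_at, signal_at, num_procs=4):
    """Return the finish time of the last epoch (speculative thread).

    Each epoch runs epoch_len cycles of work. It stalls at offset wait_at
    until its predecessor's value arrives, and forwards its own value at
    offset signal_at. The critical forwarding path is signal_at - wait_at.
    """
    finish = []
    value_ready = 0  # time at which the predecessor's value is available
    for i in range(num_epochs):
        # A processor becomes free when the epoch num_procs earlier finishes.
        start = 0 if i < num_procs else finish[i - num_procs]
        resume = max(start + wait_at, value_ready)  # stall at the wait
        value_ready = resume + (signal_at - wait_at)  # signal the successor
        finish.append(resume + (epoch_len - wait_at))
    return finish[-1]


if __name__ == "__main__":
    print(choose_mechanism(0.01, 200, 10))  # rare dependence
    print(choose_mechanism(0.50, 200, 10))  # frequent dependence
    print(simulate(8, 100, wait_at=10, signal_at=90))  # long forwarding path
    print(simulate(8, 100, wait_at=10, signal_at=20))  # signal scheduled early
```

With these made-up numbers, eight 100-cycle epochs take 660 cycles when the value is forwarded late (an 80-cycle forwarding path) and 230 cycles when the signal is moved to shortly after the wait (a 10-cycle path), compared with 800 cycles for sequential execution. The model ignores misspeculation recovery, thread spawn overhead, and in-order commit, all of which a real TLS system must account for.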

Keywords/Search Tags:

Value communication, Compiler, Parallel execution, Speculation, Thread-level, Optimization, Performance, Benchmarks

Related items

1	Research On The Sufficient Thread Submission Mechanism In Software Thread Level Speculation Systems
2	Research On The Thread-level Speculation Execution Model For LLVM Compiler
3	Thread Level Speculation Research Based On OpenMP
4	Compiler techniques for thread-level speculation
5	Study On The Key Technologies Of Thread-Level Speculation On Multi-core Platform
6	Optimization And Analysis Of The Noc To Reduce Squashes In Thread Level Speculation
7	Research And Implementation Of Simulation Environment For Thread Level Parallelization On Multi-core Architecture
8	Research On Parallel Model And Compiler Optimization Technique Based On Multi-core
9	Research On Mechanism Of Thread-Level Speculation Based On Pthreads
10	The Research And Implementation Of The Key Techniques On Single Chip Multiprocessors