Architectural and compiler mechanisms for accelerating single thread applications on multicore processors

Posted on:2009-07-17

Degree:Ph.D

Type:Dissertation

University:University of Michigan

Candidate:Zhong, Hongtao

GTID:1448390005958961

Subject:Computer Science

Abstract/Summary:

Multicore systems have become the dominant mainstream computing platform. One of the biggest challenges going forward is how to efficiently utilize the ever-increasing computational power provided by multicore systems. Applications with large amounts of explicit thread-level parallelism naturally scale performance with the number of cores. However, single-thread applications realize little to no gains from multicore systems.

This work investigates architectural and compiler mechanisms to automatically accelerate single-thread applications on multicore processors by efficiently exploiting three types of parallelism across multiple cores: instruction-level parallelism (ILP), fine-grain thread-level parallelism (TLP), and speculative loop-level parallelism (LLP).

A multicore architecture called Voltron is proposed to exploit these different types of parallelism. Voltron can organize the cores for execution in either coupled or decoupled mode. In coupled mode, several in-order cores are coalesced to emulate a wide-issue VLIW processor. In decoupled mode, the cores execute a set of fine-grain communicating threads extracted by the compiler. By executing fine-grain threads in parallel, Voltron provides coarse-grained out-of-order execution capability using in-order cores. Architectural mechanisms for speculative execution of loop iterations are also supported in decoupled mode. Voltron can dynamically switch between the two modes with low overhead to exploit the best form of available parallelism.

This dissertation also investigates compiler techniques to exploit different types of parallelism on the proposed architecture. First, this work proposes compiler techniques to manage multiple instruction streams so that they collectively function as a single logical stream on a conventional VLIW to exploit ILP. Second, this work studies compiler algorithms to extract fine-grain threads. Third, this dissertation proposes a series of systematic compiler transformations and a general code generation framework to expose hidden speculative LLP hindered by register and memory dependences in the code. These transformations collectively remove inter-iteration dependences that are caused by subsets of isolatable instructions, are unwindable, or occur infrequently.

Experimental results show that the proposed mechanisms can achieve speedups of 1.33 and 1.14 on 4-core machines by exploiting ILP and TLP, respectively. The proposed transformations increase the DOALL loop coverage in applications from 27% to 61%, resulting in a speedup of 1.84 on 4-core systems.
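As a rough sanity check on the DOALL figures above, the sketch below applies Amdahl's law to the reported coverage numbers. It is an illustration only and is not taken from the dissertation. It assumes that "coverage" means the fraction of sequential execution time spent in DOALL loops and that those loops scale perfectly across cores; the function name amdahl_speedup is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when `parallel_fraction` of the run time scales
    perfectly across `cores` and the remainder stays sequential.

    Assumption: DOALL coverage is treated as the parallel fraction of
    execution time; the abstract does not state how coverage is measured.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    sequential_part = 1.0 - parallel_fraction
    parallel_part = parallel_fraction / cores
    return 1.0 / (sequential_part + parallel_part)


if __name__ == "__main__":
    # Coverage before and after the transformations, as given in the abstract.
    for coverage in (0.27, 0.61):
        bound = amdahl_speedup(coverage, cores=4)
        print(f"coverage {coverage:.0%}: ideal 4-core speedup {bound:.2f}")
```

Under these assumptions the ideal 4-core speedup at 61% coverage is about 1.84, which is in line with the figure reported in the abstract, while 27% coverage would allow at most about 1.25.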

Keywords/Search Tags:

Applications, Multicore, Compiler, Systems, Mechanisms, Single, Architectural, Thread

Related items

1	Architectural and compiler support for DSP applications
2	The velocity compiler: Extracting efficient multicore execution from legacy sequential codes
3	Co-optimization Design Of Multicore Systems For High Performance Computing Nodes
4	Compiler techniques for thread-level speculation
5	Compiler optimizations for SIMD/GPU/multicore architectures
6	Research On The Encryption Algorithm Of Linear Chaos In Multicore Based On OpenMP And Thread Optimization
7	Architectural and compiler techniques for microprocessor power and performance management
8	Architectural and compiler issues for tolerating latencies in horizontal architectures
9	The Research And Implementation Of The Key Techniques On Single Chip Multiprocessors
10	Virtual Private Machines: A resource abstraction for multicore computer systems