Exploiting choice: Resource sharing control in Simultaneous MultiThreading microarchitectures

Posted on: 2009-12-21
Degree: Ph.D
Type: Dissertation
University: University of California, Irvine
Candidate: Liu, Chen
Full Text: PDF
GTID: 1448390002498036
Subject: Engineering
Abstract/Summary:
Simultaneous MultiThreading (SMT) achieves improved system resource utilization and, accordingly, higher instruction throughput because it exploits Thread-Level Parallelism (TLP) in addition to conventional Instruction-Level Parallelism (ILP). The key to high-performance SMT is to optimize the distribution of shared system resources among the threads. Theoretically, system resources in every pipeline stage of an SMT microarchitecture can be dynamically shared. However, existing dynamic sharing mechanisms have no control over the resource distribution, which can allow one thread to grab too many resources and clog the pipeline. Existing fetch policies address the resource distribution problem indirectly: by selecting which thread(s) may inject instructions into the pipeline, they control the "input" to the pipeline. These fetch policies generally prioritize threads by examining certain parameters from the pipeline stages as an indication of their performance, such as the number of instructions, the number of cache misses, and/or the number of unresolved branches. However, these parameters do not necessarily reflect the actual run-time performance of a thread, and may even diverge from it in some cases. In this work, we strive to quantitatively determine the balance between controlled resource allocation and dynamic sharing of different system resources, and its impact on the performance of SMT processors. By simulating different control mechanisms, we find that controlling the resource sharing of either the Instruction Fetch Queue (IFQ) or the ReOrder Buffer (ROB) is not sufficient if implemented alone (a 2% performance increase and an 8% performance decrease, respectively, in harmonic mean when compared with dynamic sharing). However, controlling the resource sharing of both the IFQ and the ROB can yield an average performance gain of 38% in harmonic mean, or 68% in geometric mean, when compared with dynamic sharing.
Corresponding to this performance improvement, the average L1 D-cache miss rate is reduced by 28% to 33%, and the average time that an instruction resides in the pipeline is reduced by 34%. Together, these results demonstrate the power of the resource sharing control mechanism we propose.
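The abstract reports gains in both harmonic mean and geometric mean, and the two figures differ considerably (38% versus 68%). The sketch below illustrates why the two means can disagree. It assumes that each thread's performance is expressed as a relative IPC (its IPC when co-scheduled divided by its IPC when running alone); the abstract does not state the exact definition used in the dissertation, so that definition, the function names, and the sample numbers are all hypothetical.

```python
from math import prod


def harmonic_mean(values):
    """Harmonic mean: n / sum(1/x). Pulled strongly toward the smallest value."""
    return len(values) / sum(1.0 / v for v in values)


def geometric_mean(values):
    """Geometric mean: n-th root of the product of the values."""
    return prod(values) ** (1.0 / len(values))


# Hypothetical per-thread relative IPCs for a 4-thread workload
# (illustrative numbers only, not results from the dissertation).
# Both workloads have the same arithmetic mean of 0.5.
balanced = [0.5, 0.5, 0.5, 0.5]   # every thread slowed equally
skewed = [0.9, 0.9, 0.1, 0.1]     # two threads starve the other two

for name, workload in (("balanced", balanced), ("skewed", skewed)):
    hm = harmonic_mean(workload)
    gm = geometric_mean(workload)
    print(f"{name}: harmonic={hm:.2f} geometric={gm:.2f}")
```

For the balanced workload both means are 0.50. For the skewed workload the harmonic mean falls to 0.18 and the geometric mean to 0.30, even though the arithmetic mean is unchanged. Under this assumed definition, the harmonic mean is the stricter of the two about starved threads, which is one plausible reason a resource-sharing study would report both.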
Keywords/Search Tags:Resource, Thread, SMT, Pipeline