
Compiling For The Speculative Multithreading Architecture

Posted on: 2002-05-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Deng
Full Text: PDF
GTID: 1118360065461567
Subject: Computer Science and Technology
Abstract/Summary:
High-performance, general-purpose microprocessors serve as the compute engines for machines ranging from personal computers to supercomputers, and sequential programs make up a major portion of the real-world software that runs on them. State-of-the-art microprocessors exploit instruction-level parallelism (ILP) by searching a dynamic window of instructions for independent instructions and executing them on a wide-issue pipeline. Enlarging the window and widening the issue width to extract more ILP can make a high clock speed harder to achieve, which limits overall performance, especially in the coming era of billion-transistor chips.

The Speculative Multithreading Architecture (SMA) employs a decentralized organization, building multiple small windows and many narrow-issue execution units to exploit massive ILP. Sequential programs are partitioned into code fragments called threads, which are executed speculatively in parallel. Previous research showed that the SMA could achieve a substantial performance boost and efficient resource utilization.

Compiler optimization plays a very important role in SMA research. Three key factors hamper the SMA processor: context load imbalance, inter-thread control dependence, and inter-thread data dependence. To sustain the performance boost, the SMA compiler must eliminate these factors thoroughly.

The work of this dissertation includes:

1. A thorough investigation of the execution behavior of various applications on the SMA architecture, together with a presentation of the key performance factors.
2. A set of heuristic rules to accelerate the speculative execution of SMA threads. The rules include an optimized thread partitioning strategy, a context load-balancing strategy, and a DEE-like thread mapping strategy.
3. A thorough review of the memory bandwidth requirements of the SMA processor and of the differences among various instruction fetch policies. To improve cache performance under the SMA model, the dissertation introduces a cooperative hardware/software optimization: on the software side, the compiler inserts prefetch instructions explicitly; on the hardware side, an SMA cache filter is added to cut down unnecessary prefetches.
4. A continuous optimization framework based on dynamic profiles, SMARCOF, guided by a feedback-based optimization strategy. SMARCOF is built on the DLX simulator, modified with SMA-specific extensions and the heuristic optimization rules. Simulation of SPEC code shows that these rules can exploit hybrid parallelism effectively with rather low overhead.

In conclusion, the SMA architecture is a promising way to implement high-performance processors, and the continuous optimization framework SMARCOF can use dynamic execution profiles and heuristic rules to eliminate SMA performance obstacles effectively. The preliminary work discussed in this thesis shows encouraging potential for performance gains and good application compatibility for SMARCOF. Further improvement can be expected.
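The abstract does not describe how the SMA cache filter decides which prefetches are unnecessary. As a rough illustration only, the Python sketch below assumes one plausible policy: a compiler-inserted prefetch is dropped when its cache line is already resident or was requested very recently. The class name `PrefetchFilter`, the parameters `line_size` and `history`, and the policy itself are hypothetical and are not taken from the dissertation.

```python
from collections import OrderedDict


class PrefetchFilter:
    """Hypothetical prefetch filter (an assumption, not the dissertation's design).

    Drops a prefetch when its cache line is already resident or was
    requested among the last `history` issued prefetches.
    """

    def __init__(self, line_size=32, history=4):
        self.line_size = line_size   # bytes per cache line (assumed value)
        self.history = history       # number of recent lines remembered
        self.recent = OrderedDict()  # recently issued line numbers, oldest first

    def should_issue(self, addr, cached_lines):
        line = addr // self.line_size
        if line in cached_lines or line in self.recent:
            return False             # redundant: filter it out
        self.recent[line] = True
        if len(self.recent) > self.history:
            self.recent.popitem(last=False)  # forget the oldest entry
        return True


def filter_prefetches(addresses, cached_lines, filt):
    """Return the prefetch addresses that pass the filter, in order."""
    return [a for a in addresses if filt.should_issue(a, cached_lines)]


if __name__ == "__main__":
    filt = PrefetchFilter(line_size=32, history=4)
    # Line 0 is already cached; lines 2 and 4 are not.
    print(filter_prefetches([0, 8, 64, 72, 128, 64], {0}, filt))
```

In this toy run, only the first prefetch to each uncached line is issued; the repeats and the prefetches to the resident line are suppressed, which is the kind of bandwidth saving the hardware side of the cooperative scheme is meant to provide.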
Keywords/Search Tags:SMA, Compiler Optimization, Prefetch, Dynamic Execution Profile, Feedback-based Optimization