
Research On Hardware Techniques For Thread Level Parallelism

Posted on: 2004-07-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Zhu
Full Text: PDF
GTID: 1118360122960992
Subject: Computer application technology
Abstract/Summary:
This dissertation is sponsored by a National "Tenth Five-Year Plan" advance research project. It investigates high-performance microprocessor architecture and describes the research and development of ARMP (Aviation microelectronic center RISC Microprocessor), an embedded 32-bit microprocessor. The ARMP is pipelined, has excellent real-time performance, and supports precise interrupts. It is compatible with the PowerPC 603e Instruction Set Architecture (ISA) and will be implemented in a 0.25 µm CMOS process. The transistor count is 3.8 million, the package is a surface-mount 240-pin quad flat pack (QFP240), and the die size is 98 mm².

Recent research progress on microprocessor architecture is studied extensively as technical preparation for microprocessor design. Microprocessor architecture design has entered the era of thread level parallelism. A multithreaded microprocessor, in which many hardware contexts share an execution core, can efficiently exploit both instruction level parallelism and thread level parallelism to achieve higher performance and a better performance/power ratio. In February 2002, Intel announced the Hyper-Threading technique used in the Intel Xeon, which makes the Intel Xeon a multithreaded microprocessor with two hardware contexts sharing an execution core.

Against this background, this dissertation focuses on hardware techniques for thread level parallelism in high-performance microprocessors, especially multithreaded microprocessors with a superscalar execution core.

Firstly, to support research on and verification of multithreaded microprocessors, a superscalar microprocessor model, ARMP-V2, is built on the basis of the ARMP microprocessor.

Secondly, the issue logic is not only on the critical path of a superscalar microprocessor but is also critical to the performance of a multithreaded microprocessor with a superscalar execution core.
Two issue logic schemes suited to a multithreaded microprocessor are proposed. The Issue Enable Table (IET) scheme effectively reduces both the number of comparators needed by the wakeup logic and the energy the wakeup logic consumes. The Effective Dependence Matrix (EDM) scheme reduces the wakeup logic delay. In addition, a new approach to assigning issue queue entries in a simultaneous multithreading architecture is provided. This method uses issue queue entries more efficiently than earlier ones.

Thirdly, control speculation is widely used in high-performance microprocessors. In a multithreaded microprocessor with a superscalar execution core, the misprediction penalty grows as the issue width becomes larger and the pipeline becomes deeper. Efficient control-flow handling will therefore remain one of the central challenges in microarchitecture design. This dissertation proposes selective dual path execution in a multithreaded microprocessor to reduce the misprediction penalty. Confidence estimation is used to evaluate the likelihood that a branch prediction is correct. If the confidence estimator classifies a branch prediction as low confidence and the multithreaded microprocessor has an idle hardware context, both paths following the branch instruction are executed.

Fourthly, to obtain a confidence estimation scheme suitable for selective dual path execution, control speculation in high-performance microprocessors is studied extensively. A novel confidence estimation mechanism, the Decrease Constant or Reset (DCR) scheme, is developed. The DCR scheme improves the probability that an incorrect branch prediction is identified as a low confidence prediction, and the probability that a low confidence estimate corresponds to an incorrectly predicted branch.
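The fork condition described for selective dual path execution (a low confidence prediction plus an idle hardware context) can be sketched as follows. This is a minimal illustration under stated assumptions: the confidence estimator is a generic resetting counter, not the DCR scheme, whose update rule the abstract does not give. The names `ResettingCounter` and `should_fork`, and the threshold and maximum values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ResettingCounter:
    """Generic resetting confidence counter (an assumption, NOT the DCR scheme)."""
    threshold: int = 4   # hypothetical: counts at or above this are high confidence
    maximum: int = 15    # hypothetical saturation limit
    value: int = 0

    def is_high_confidence(self) -> bool:
        return self.value >= self.threshold

    def update(self, prediction_correct: bool) -> None:
        # Count up (saturating) on a correct prediction, reset on a misprediction.
        if prediction_correct:
            self.value = min(self.value + 1, self.maximum)
        else:
            self.value = 0


def should_fork(counter: ResettingCounter, idle_contexts: int) -> bool:
    """Execute both paths only for a low confidence branch with a spare context."""
    return (not counter.is_high_confidence()) and idle_contexts > 0


counter = ResettingCounter()
print(should_fork(counter, idle_contexts=1))  # low confidence, context free
for _ in range(4):
    counter.update(prediction_correct=True)
print(should_fork(counter, idle_contexts=1))  # confidence has built up
```

Keeping the check for an idle context inside the decision reflects the condition stated in the text: dual path execution is only attempted when a spare hardware context exists.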
Compared with the Misprediction Distance Counter (MDC) scheme, the DCR scheme improves the SPEC and PVN metrics of confidence estimation by 151.8% and 42.19%, respectively.

Finally, the DCR scheme is used to guide the creation of selective dual path execution. The sele...
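The two confidence estimation metrics can be computed from simple branch counts. The sketch below assumes SPEC and PVN carry their usual meanings in the confidence estimation literature (specificity and predictive value of a negative test), which match the two probabilities the abstract gives for the DCR scheme. The function names and the counts in the example are hypothetical and are not results from the dissertation.

```python
def spec(incorrect_low: int, incorrect_high: int) -> float:
    """Fraction of incorrect predictions that were flagged low confidence."""
    total = incorrect_low + incorrect_high
    return incorrect_low / total if total else 0.0


def pvn(incorrect_low: int, correct_low: int) -> float:
    """Fraction of low confidence estimates that were for incorrect predictions."""
    total = incorrect_low + correct_low
    return incorrect_low / total if total else 0.0


# Hypothetical counts for illustration only.
incorrect_low, incorrect_high, correct_low = 60, 40, 100
print(spec(incorrect_low, incorrect_high))  # 0.6
print(pvn(incorrect_low, correct_low))      # 0.375
```

A high SPEC means few mispredictions escape detection. A high PVN means few forks are wasted on branches that were predicted correctly. Both properties matter when idle hardware contexts are a limited resource.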
Keywords/Search Tags: Microprocessor, Thread Level Parallelism, Multithreaded Microprocessor, Issue Logic, Misprediction Penalty, Confidence Estimation, Selective Dual Path Execution