Optimal Design Of Coarse-Grained Reconfigurable Unit For Control Flow Acceleration

Posted on: 2021-11-12 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Y Zhao
GTID: 2518306557989989 | Subject: IC Engineering
Abstract/Summary:
Compared with general-purpose processors and application-specific architectures, the Coarse-Grained Reconfigurable Architecture (CGRA) offers a distinctive combination of high energy efficiency and flexibility. However, because a CGRA is data-flow driven, it cannot efficiently handle the control flow structures found in applications. The Triggered Instruction Architecture (TIA) is a comprehensive control flow processing scheme that supports both loop branches and conditional branches. Its performance is limited, however, because every instruction must be triggered individually and because instructions depend on one another through the predication registers.

This thesis improves on the TIA mechanism in which every instruction needs its own trigger. When the order of a group of instructions is already determined, the trigger flag of the instruction is redefined so that multiple instructions are triggered at once. Added flags let the predication register be updated according to the instruction type. This multi-trigger mechanism reduces both the predication register dependencies and the number of triggers, which improves performance. Because several operations are triggered at once, the improved TIA scheme performs better on shallowly nested, long-path structures. Branch structures with more nesting levels and shorter branch paths still require frequent triggering, so the benefit there is small. The thesis therefore classifies the control flow. Shallow, long-path structures are processed with the multi-trigger mechanism. Deeply nested, short-path branches are processed by tag comparison and parallel tag rewriting based on tag-based full predication (TFP). Performance is improved further by eliminating the tag comparison operation of conditional branch instructions. The improved TIA is called HTFP (Hybrid Triggered Full Predication), and the hardware architecture of the PE unit is designed according to its principle.

The thesis completes RTL implementations of HTFP, TIA, and TFP. Control-intensive loop bodies extracted from MiBench and SPEC CPU2006 serve as test cases. After manual mapping, the three schemes are verified on the Vivado simulator: the intermediate variable values observed during simulation demonstrate functional correctness, and the initiation intervals are used to compare and analyze performance. Finally, the three designs are synthesized with Design Compiler in a TSMC 40 nm process at 50 MHz to evaluate power consumption. Experimental results show that, compared with TIA and TFP, HTFP improves performance by 23.6% and 16.9%, while power consumption increases by only 2.38% and 9.75%, respectively.
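The difference between triggering every instruction separately and triggering a fixed-order group at once can be illustrated with a toy software model. The sketch below is not the thesis's PE design or its RTL. The `Entry` class, the predicate names `p0` and `p1`, the three arithmetic operations, and the `simulate` helper are all hypothetical, and the model ignores timing, pipelining, and tags. It only counts how many trigger events are needed to run the same three ordered operations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Regs = Dict[str, int]     # data registers (hypothetical names)
Preds = Dict[str, bool]   # predicate registers (hypothetical names)

@dataclass
class Entry:
    """One trigger entry: ready when every listed predicate has the required value."""
    when: Preds
    ops: List[Callable[[Regs], None]]  # data operations issued when the entry fires
    then: Preds                        # predicate updates applied after firing

def add_inputs(r: Regs) -> None:
    r["t"] = r["a"] + r["b"]

def double(r: Regs) -> None:
    r["t"] = r["t"] * 2

def store(r: Regs) -> None:
    r["out"] = r["t"] - 1

def per_instruction_program() -> List[Entry]:
    # Each operation has its own trigger; the predicates encode the ordering.
    return [
        Entry({"p0": False, "p1": False}, [add_inputs], {"p0": True}),
        Entry({"p0": True, "p1": False}, [double], {"p1": True}),
        Entry({"p0": True, "p1": True}, [store], {"p0": False}),
    ]

def bundled_program() -> List[Entry]:
    # The order is already known, so one trigger issues all three operations.
    return [
        Entry({"p0": False, "p1": False}, [add_inputs, double, store], {"p1": True}),
    ]

def simulate(program: List[Entry], max_steps: int = 100) -> Tuple[int, int]:
    """Fire ready entries until none is ready; return (trigger events, result)."""
    regs: Regs = {"a": 3, "b": 4, "t": 0, "out": 0}
    preds: Preds = {"p0": False, "p1": False}
    fired = 0
    for _ in range(max_steps):
        ready = next(
            (e for e in program if all(preds[p] == v for p, v in e.when.items())),
            None,
        )
        if ready is None:
            break
        for op in ready.ops:
            op(regs)
        preds.update(ready.then)
        fired += 1
    return fired, regs["out"]

if __name__ == "__main__":
    print(simulate(per_instruction_program()))
    print(simulate(bundled_program()))
```

Both programs compute the same result. The per-instruction version needs one trigger event and one predicate update per operation, while the bundled version needs a single trigger for the whole group. This is the kind of reduction in trigger count and predicate dependency that the abstract attributes to the multi-trigger mechanism.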
Keywords/Search Tags:CGRA, Control-flow acceleration, Triggered mechanism, Predication execution, PE unit