
Research On Compiler Method For Dynamic Boundary Loop For Coarse Grained Reconfigurable Processor

Posted on: 2020-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: S Xie
Full Text: PDF
GTID: 2428330620958900
Subject: Integrated circuit engineering
Abstract/Summary:
Coarse-grained reconfigurable architectures (CGRAs) have been identified as a desirable platform for computationally intensive applications. However, existing mapping techniques focus only on single-level loops and the innermost loop body. The lack of efficient mapping techniques for dynamic boundary loops limits the application areas of CGRAs. This thesis addresses the mapping of single-level dynamic boundary loops and mixed boundary loops.

A CGRA consists of a regular processing element array (PEA). On the one hand, each processing element (PE) has simple functionality, which helps achieve high performance. On the other hand, the CGRA cannot process complex control flow because it lacks a program counter (PC) and other hardware support.

First, we analyze the main structure of the CGRA and the characteristics of existing mapping algorithms. Then, we present a low-cost extended CGRA that supports the mapping of dynamic boundary loops. Based on the extended CGRA, we propose the Dynamic Boundary Static Schedule (DBSS) to map dynamic boundary loops. Unlike traditional mapping algorithms, DBSS maps the loop body and the control-related operators of the loop itself at the same time, and it issues the loop body at runtime according to the loop condition. DBSS can map not only nested branches but also single-level dynamic boundary loops.

To map loop nests that mix static boundary loops and dynamic boundary loops, we propose the Mixed Boundary Static Schedule (MBSS). DBSS alone would involve many control-related operators for both the static boundary loop and the dynamic boundary loop. MBSS first applies conventional loop unrolling to remove the static boundary loop layer, and then processes the remaining dynamic boundary loop layers. MBSS therefore maps only the control-related operators of the dynamic boundary loop, which improves performance and simplifies the data flow graph.

Finally, the extended CGRA is implemented in Verilog, and the compilation techniques are implemented as an LLVM pass. Compared with the latest mapping algorithms, DBSS and MBSS achieve a 2.2× speedup on average and performance improvements of 24% and 38%, respectively. In addition, DBSS and MBSS save energy, and the extra hardware overhead is less than 2%. The proposed method also offers better scalability and flexibility.
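
The loop shapes discussed in the abstract can be illustrated with a minimal sketch. This is not the DBSS or MBSS algorithm itself, only an illustration of what a dynamic boundary loop and a mixed boundary loop nest look like, and of how unrolling the static layer leaves only dynamic loops. The function names, the zero sentinel, the fixed trip count of 2, and the choice of a static outer loop around a dynamic inner loop are all assumptions made for the example.

```python
STATIC_TRIPS = 2  # hypothetical trip count, known at compile time


def dynamic_sum(values):
    """Dynamic boundary loop: the trip count depends on runtime data.

    The loop stops at the first zero, so the number of iterations
    cannot be determined before the data is seen.
    """
    total = 0
    i = 0
    while values[i] != 0:
        total += values[i]
        i += 1
    return total


def mixed_nest(rows):
    """Mixed boundary nest: static outer loop, dynamic inner loop."""
    total = 0
    for r in range(STATIC_TRIPS):      # static boundary: always 2 trips
        j = 0
        while rows[r][j] != 0:         # dynamic boundary: data dependent
            total += rows[r][j]
            j += 1
    return total


def mixed_nest_unrolled(rows):
    """Same computation after fully unrolling the static outer loop.

    Only dynamic boundary loops remain, so only their loop-control
    operations (compare and branch on the sentinel) are left.
    """
    total = 0
    j = 0
    while rows[0][j] != 0:             # copy for outer iteration 0
        total += rows[0][j]
        j += 1
    j = 0
    while rows[1][j] != 0:             # copy for outer iteration 1
        total += rows[1][j]
        j += 1
    return total
```

Both versions of the nest compute the same result for any input whose first two rows are zero-terminated; the unrolled form simply has no counter or bound check for the outer loop.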
Keywords/Search Tags:CGRA, PE, dynamic boundary loop