
Research On Architecture Design And Modeling Method Of Coarse-Grained Reconfigurable Array Based On Dataflow Decoupling

Posted on: 2021-11-29  Degree: Master  Type: Thesis
Country: China  Candidate: T Hong  Full Text: PDF
GTID: 2518306503464734  Subject: Electronic Science and Technology
Abstract/Summary:
In recent years, coarse-grained reconfigurable array (CGRA) architectures have attracted renewed interest owing to their high energy efficiency for domain-specific applications. A CGRA is configured by task information transmitted from the host, and uses reconfigurable processing elements and interconnect to implement the computing functions required by different applications. As application domains and scales expand, the dataflow graphs of reconfigurable computing become increasingly complex, which makes mapping applications onto large-scale spatial structures more difficult. During array execution, dataflows run at different rates because of control, memory access and other factors. When dataflows with different rates are coupled in the spatial structure of the array, the pipeline stalls, which degrades the performance of the reconfigurable array. This thesis studies this problem, and the main work is as follows:

1. A dataflow decoupling optimization scheme is proposed. By removing non-data-related synchronization and using lightweight storage space, dataflow regions with different rates can be isolated at the iteration level, which improves the performance of the reconfigurable array.

2. The performance loss caused by control synchronization in applications is also characterized as a coupling problem between regions with different rates, so the proposed decoupling technique can remove the coupling caused by control synchronization as well. Combined with a customized loop synchronization protocol, the decoupling technique improves control flexibility and performance under complex loop nesting.

3. On this basis, an array module structure, a dataflow control method and a mapping method based on dataflow decoupling and loop customization are proposed.

4. To verify the effectiveness of this research, a simulator that supports fast architecture iteration and bottleneck exploration is designed. The simulator provides an experimental platform composed of a cycle-accurate array model and a cycle-approximate memory model.

The experimental results show that, in the selected applications and compared with the baseline reconfigurable architecture, decoupling memory access from execution achieves a performance improvement of 4.15×, and inter-loop decoupling brings a performance improvement of 52% in nested loops. The experiments also combine all decoupling optimizations for a comprehensive evaluation, which achieves average performance improvements of 2.80× and 34% over different CGRAs in the selected applications, and 3.20× over the CPU. The experiments show that a coarse-grained reconfigurable architecture based on dataflow decoupling reduces pipeline stalls and improves control flexibility.
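The effect of coupling two regions that run at different rates can be illustrated with a toy timing model. The sketch below is not the thesis's simulator or architecture: it assumes that the lightweight storage behaves like a small FIFO between a producer region (for example, memory access) and a consumer region (for example, execution), and the function names and per-item latencies are hypothetical values chosen only for illustration.

```python
def coupled_cycles(produce, consume):
    """Two-stage lock-step pipeline (toy model of coupled regions).

    In step t the producer works on item t while the consumer works on
    item t-1; neither may advance until both have finished the step.
    """
    n = len(produce)
    total = 0
    for step in range(n + 1):
        p = produce[step] if step < n else 0        # producer idle after last item
        c = consume[step - 1] if step >= 1 else 0   # consumer idle in first step
        total += max(p, c)                          # slower region stalls the other
    return total


def decoupled_cycles(produce, consume, depth):
    """Same two regions linked only by a FIFO of `depth` slots (assumed model).

    The producer may start item i once the consumer has popped item
    i - depth, so it can run ahead by up to `depth` items.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    n = len(produce)
    prod_done = [0] * n
    cons_start = [0] * n
    cons_done = [0] * n
    for i in range(n):
        prev_prod = prod_done[i - 1] if i > 0 else 0
        slot_free = cons_start[i - depth] if i >= depth else 0
        prod_done[i] = max(prev_prod, slot_free) + produce[i]
        prev_cons = cons_done[i - 1] if i > 0 else 0
        cons_start[i] = max(prev_cons, prod_done[i])
        cons_done[i] = cons_start[i] + consume[i]
    return cons_done[-1] if n else 0


if __name__ == "__main__":
    # Hypothetical per-item latencies in cycles; not taken from the thesis.
    produce = [3, 1, 3, 1]
    consume = [3, 1, 3, 1]
    print(coupled_cycles(produce, consume))             # 13
    print(decoupled_cycles(produce, consume, depth=1))  # 13
    print(decoupled_cycles(produce, consume, depth=2))  # 11
```

In this toy case a single buffer slot behaves like the lock-step pipeline, while a second slot lets the producer run ahead during its fast items and hides part of the rate mismatch. The numbers only illustrate the mechanism and say nothing about the speedups reported in the abstract.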
Keywords/Search Tags:coarse-grained reconfigurable array, decoupling, loop control, indirect access