
Research On Inner Loop Unrolling Method For Vector DSP

Posted on: 2022-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: H S Lu
Full Text: PDF
GTID: 2518306761496554
Subject: Computer Software and Application of Computer
Abstract/Summary:
In modern processors, the VLIW architecture with a vector processing unit has gradually become a typical representative of high-performance digital signal processors (DSPs). This type of architecture is characterized by rich register resources and multiple execution units. Applying loop unrolling when compiling DSP algorithm code makes better use of these hardware resources and improves code performance. The effect of loop unrolling is determined mainly by the unrolling factor. However, traditional methods for selecting the unrolling factor give limited consideration to hardware resource characteristics, so they do not adequately exploit the instruction-level parallelism of vector DSP code. This thesis studies loop unrolling on the basis of the characteristics of vector DSPs, establishes an inner loop unrolling method, and proposes a corresponding unrolling factor selection algorithm together with related supporting algorithms. The work mainly covers the following three aspects:

1) The unrolling factor selection algorithm VCLUF for inner loops. Considering that the vector DSP architecture has multiple types of heterogeneous register resources and multiple execution units, an inner loop unrolling factor selection model is constructed, and an algorithm for determining the corresponding unrolling factor is studied. The algorithm focuses on factors such as the scalar or vector attribute of each instruction in the loop and the resource usage rules of the base address registers and index registers, and it explores a heuristic factor for execution unit usage. For inner loops whose code consists mainly of vector processing instructions or mainly of scalar processing instructions, experiments show that the algorithm can find a more suitable unrolling factor.

2) The inner loop unrolling factor selection algorithm SVCLUF for mixed scalar-vector code. DSP algorithms whose code uses both the vector processing unit and the scalar processing unit can obtain better performance. Building on VCLUF, this thesis considers loop bodies that contain both scalar processing code and vector processing code, and proposes the SVCLUF algorithm to determine the inner loop unrolling factor for such mixed code. The algorithm establishes an improved model by further distinguishing the influence of hardware scalar resources from that of vector resources on the selection of the unrolling factor. Experiments show that the algorithm can find a more suitable unrolling factor for inner loops with mixed scalar-vector code.

3) Analysis of loop body code information and loop unrolling processing. To provide the information that the unrolling factor selection algorithms need, a loop body code information analysis algorithm is designed. It analyzes the vector attribute and execution unit type of each instruction in the inner loop code, as well as the register classes of the loop variants and invariants. In addition, based on this analysis algorithm and the VCLUF and SVCLUF algorithms, the related loop unrolling sub-algorithms are improved according to the structural characteristics of vector DSPs, which yields a complete loop unrolling processing scheme. These sub-algorithms mainly include the recognition and processing algorithms that distinguish the scalar attribute from the vector attribute of induction variables, as well as the ported algorithms for generating the unrolled loop, generating the tail loop, and renaming registers.
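As a rough illustration of the general transformation the abstract refers to (unrolled loop, tail loop, and register renaming), the sketch below unrolls a dot-product inner loop in Python. It is not the VCLUF or SVCLUF algorithm, whose details the abstract does not give. The function `pick_unroll_factor` and its parameters `regs_available`, `regs_per_iteration`, and `max_factor` are hypothetical names for a simple register-budget heuristic assumed here only for illustration.

```python
def pick_unroll_factor(regs_available, regs_per_iteration, max_factor=8):
    """Hypothetical heuristic (an assumption, not VCLUF/SVCLUF): choose the
    largest power-of-two factor whose unrolled body fits the register budget."""
    factor = 1
    while (factor * 2 <= max_factor
           and factor * 2 * regs_per_iteration <= regs_available):
        factor *= 2
    return factor


def dot_unrolled(a, b, factor):
    """Dot product with the inner loop unrolled `factor` times, plus a tail loop."""
    n = len(a)
    # One accumulator per unrolled copy: a stand-in for register renaming,
    # which removes the dependence between copies of the loop body.
    acc = [0] * factor
    main_end = n - n % factor  # last index covered by the unrolled loop

    # Unrolled main loop: each trip handles `factor` original iterations.
    for i in range(0, main_end, factor):
        for k in range(factor):  # a compiler would emit straight-line copies
            acc[k] += a[i + k] * b[i + k]

    total = sum(acc)

    # Tail loop: the remaining n % factor iterations run one at a time.
    for i in range(main_end, n):
        total += a[i] * b[i]
    return total
```

With a budget of 16 registers and 3 registers per iteration, this heuristic would settle on a factor of 4, because a factor of 8 would need 24 registers. A real selection model for a vector DSP would also need to account for the scalar or vector attribute of each instruction and for execution unit usage, as the abstract describes.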
Keywords/Search Tags: Vector DSP, Compiling technology, VLIW, Loop unrolling, Unrolling factor