
Research On Clustering Algorithm And Optimization On Complex Operation For Multi-cluster VLIW DSP

Posted on: 2015-02-04, Degree: Master, Type: Thesis
Country: China, Candidate: C F Ding, Full Text: PDF
GTID: 2268330428999851, Subject: Computer application technology
Abstract/Summary:
Modern digital signal processors (DSPs) achieve instruction-level parallelism with either a superscalar or a very long instruction word (VLIW) architecture. A VLIW DSP leaves the recognition and scheduling of parallel tasks to the compiler, which does the scheduling work instead of the hardware, so a very capable compiler is needed to use a VLIW DSP well.

BWDSP is a high-performance digital signal processor developed independently by a research institute of China Electronics Technology Group. It adopts a VLIW architecture. To allow applications to be developed in the C programming language, we port the Open64 compilation infrastructure to BWDSP. In addition, we implement several optimizations tailored to the architecture and instruction set of BWDSP.

BWDSP has more than one cluster, so a clustering algorithm is needed to make use of its resources. The purpose of the clustering algorithm is to assign each instruction to an appropriate cluster. However, in a clustered architecture a function unit can only access the registers in its own cluster. Therefore, if an instruction needs the result of an instruction in another cluster, inter-cluster transfer instructions must be inserted. A clustering algorithm has to balance parallelism against transfers between clusters. This thesis proposes a clustering algorithm based on a data flow graph in SSA (static single assignment) form. First, the algorithm constructs the SSA-form data flow graph. Then, traversing the data flow graph bottom-up, it calculates for each instruction a score that reflects resource usage and inter-cluster transfers. The cluster with the highest score is selected for that instruction. Experiments confirm that the algorithm improves compilation performance.

Complex-number operations are among the most common operations on a DSP; the FFT, for example, contains many of them. This was taken into account when BWDSP was designed, so its instruction set contains many complex-number instructions. However, Open64 does not optimize complex-number operations, so we optimize them to obtain higher performance. First, we add a machine description for complex-number operations. Then we use compiler directives to recognize complex-number operations. Finally, we combine the recognized operations into a complex-number instruction. In addition, we modify cluster assignment and register allocation to handle complex-number operations.
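The clustering step described in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: the abstract does not give the scoring formula, so the score below (a penalty per inter-cluster transfer plus a penalty for cluster load) is an assumption, as are the names `assign_clusters`, `transfer_cost`, `load_cost` and the default of two clusters. The sketch also assumes the instructions are listed with producers before consumers, so that walking the list in reverse visits every consumer before its producers.

```python
from collections import defaultdict


def assign_clusters(operands, num_clusters=2, transfer_cost=2.0, load_cost=1.0):
    """Greedy bottom-up cluster assignment (illustrative only).

    operands maps each instruction name to the names of the instructions
    whose results it reads. In SSA form each value has exactly one
    definition, so a name identifies its producer unambiguously.
    Assumption: keys are listed with producers before consumers.
    """
    # Invert the operand lists: for each producer, who consumes its result?
    consumers = defaultdict(list)
    for inst, sources in operands.items():
        for src in sources:
            consumers[src].append(inst)

    load = [0] * num_clusters  # instructions already placed on each cluster
    cluster_of = {}

    # Bottom-up: visit consumers first, then the producers that feed them.
    for inst in reversed(list(operands)):
        best_cluster, best_score = 0, float("-inf")
        for c in range(num_clusters):
            # A consumer placed on another cluster would need a transfer.
            transfers = sum(1 for user in consumers[inst] if cluster_of[user] != c)
            # Hypothetical score: fewer transfers and lighter load are better.
            score = -transfer_cost * transfers - load_cost * load[c]
            if score > best_score:
                best_cluster, best_score = c, score
        cluster_of[inst] = best_cluster
        load[best_cluster] += 1
    return cluster_of


# Two independent dependence chains: a -> c -> e and b -> d -> f.
chains = {"a": [], "b": [], "c": ["a"], "d": ["b"], "e": ["c"], "f": ["d"]}
result = assign_clusters(chains)
```

With these weights each chain ends up on its own cluster, so the two chains can run in parallel without any inter-cluster transfer. Raising `load_cost` relative to `transfer_cost` would favor spreading instructions across clusters at the price of more transfers, which is the trade-off the abstract describes.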
Keywords/Search Tags: BWDSP, Open64, very long instruction word, cluster assigning, complex operations, static single assignment form