Parallelization Of Digital Signal Transforming Functions Based On Multicore VLIW DSP

Posted on: 2016-04-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Zhen | Full Text: PDF
GTID: 2308330470957730 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the multimedia industry, digital signal processing technology and the digital signal processor (DSP) are becoming increasingly important, and researchers are focusing on improving the performance of DSP software applications. However, as processor architectures grow more complex, developers find it difficult to write code efficient enough to exploit the full potential of the processor. Code optimization based on the hardware features of the target platform is therefore considered important in computer science, and a key point of optimization is improving the efficiency of memory access. This thesis studies how to optimize the digital signal processing function library for BWDSP100, the newest high-performance DSP designed by a leading research institute in China, which implements a multicluster VLIW (Very Long Instruction Word) and SIMD architecture and is widely used in digital signal processing, image processing and telecommunications. The library, abbreviated as libDSP, is the embedded software package for BWDSP100. It includes window functions, digital signal transforming functions, statistical signal processing functions, linear prediction functions, multilevel signal processing functions, wave generating functions, digital filter functions and basic subroutines. As the software foundation of the BWDSP100 platform, libDSP offers APIs (Application Program Interfaces) to all high-level applications. libDSP is thus a crucial part of the BWDSP100 platform, and the practical performance of BWDSP100 depends largely on the efficiency of libDSP. However, libDSP has not yet been optimized to exploit the hardware's advantages and is therefore not efficient enough, which in turn prevents BWDSP100 from reaching its full potential.
This thesis studies several methods for improving libDSP. Because the library is intricate and contains a very large number of subroutines, the work focuses on the digital signal transforming functions, the most important part of libDSP, abbreviated henceforth as DST functions. The study proceeds in three steps: (1) Reimplement the library functions using the special assembly instructions provided by BWDSP100, reducing both the number of code lines and the running time of the programs. BWDSP100 supports accumulation instructions, complex-arithmetic instructions, maximum and minimum instructions, fixed-point arithmetic instructions and SPU instructions. (2) Apply loop unrolling to linear loop computations, which are the most time-consuming sections of the programs. Loop unrolling reduces the number of iterations by increasing the amount of computation in each iteration. It works for two reasons: on the one hand, it reduces operations that do not contribute to the final result; on the other, it lengthens the loop body, paving the way for further optimizations such as instruction reordering, variable renaming, and the reduction of instruction and data dependencies. (3) Reorder the code to exploit the 16-issue capability of BWDSP100. This brings a twofold benefit: first, the pipeline can be kept full, which reduces waiting time; second, the unrolling factor can be increased further, which in turn improves the performance of the library. Experimental results show that the speedup of every function is greater than 9, and 80% of the functions achieve a speedup above 10. This thesis can serve as a guide for other studies.
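Steps (2) and (3) can be illustrated with a minimal sketch in portable C. This is not the thesis's BWDSP100 assembly: the function names `dot_simple` and `dot_unrolled4`, the choice of a dot product as the linear loop, and the unrolling factor of 4 are all assumptions made for illustration. The unrolled version performs four multiply-accumulates per iteration and renames the single accumulator into four independent ones, so the four statements in the loop body have no data dependency on each other and a wide-issue machine or its compiler is free to schedule them in parallel.

```c
#include <stddef.h>

/* Baseline: one multiply-accumulate per iteration.
 * Every iteration depends on the previous value of sum. */
float dot_simple(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Unrolled by 4 (illustrative factor, not taken from the thesis).
 * The accumulator is renamed into s0..s3 so that the four
 * multiply-accumulates in the body are independent of each other. */
float dot_unrolled4(const float *a, const float *b, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main loop: one counter update and one branch per 4 elements. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++) {
        s0 += a[i] * b[i];
    }

    /* Combine the partial sums. */
    return (s0 + s1) + (s2 + s3);
}
```

Because floating-point addition is not associative, the unrolled version can differ from the baseline in the last bits of the result for general inputs. Whether that is acceptable depends on the accuracy requirements of the function being optimized.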
Keywords/Search Tags: Very Long Instruction Word, SIMD, Digital Signal Processing, Loop Unrolling, Parallelization, Multicluster, BWDSP100