Research And Implementation Of Software Pipelining Framework For BWDSP104X

Posted on:2017-05-21

Degree:Master

Type:Thesis

Country:China

Candidate:L T Hong

Full Text:PDF

GTID:2308330485951848

Subject:Computer software and theory

Abstract/Summary:

Most modern high-performance digital signal processors use a VLIW architecture, which can issue multiple instructions in the same clock cycle in order to exploit instruction-level parallelism and achieve higher performance. Programs can take full advantage of the processor's hardware resources only through compiler optimization, which poses challenges for the compiler back end. Loops are generally the most time-consuming part of a program, which makes loop optimizations, including vectorization, loop unrolling, predicated optimization, and software pipelining, especially important for improving performance.

Our research is based on Open64, an open-source compiler released under a GNU license. It is a well-suited research platform because it has a clear module structure and a comprehensive back-end optimization design. Open64 already implements software pipelining for the IA64 target, and its detailed overall process is valuable to our research.

The main task of the thesis is to implement software pipelining for BWDSP104X on the Open64 platform. First, loops are selected: loops that are too small or irregular are not software pipelined. Second, the minimum initiation interval is calculated from the resource constraints given by the machine description and the data dependences given by the data dependence graph. Third, modulo scheduling is performed using the initiation interval and the modulo resource table. Finally, BWDSP104X object code is obtained through modulo variable expansion and register allocation. To maximize the performance of the software pipeline, the multi-cluster resources should be fully used, so the thesis proposes instruction clustering, instruction scheduling, and software pipelining for the multi-cluster architecture. Performance still suffers, however, if the loop contains branch statements. The processor has predicated instructions and predicate registers, which provide hardware support for predicated optimization. Experimental results show that multi-cluster software pipelining combined with predicated optimization yields a better performance improvement for loop programs running on the BWDSP104X multi-cluster processor.
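The second and third steps of the abstract (minimum initiation interval, then modulo scheduling against a modulo resource table) can be illustrated with a small sketch. It is not the thesis's Open64 implementation: the function names, the four-operation loop, the unit names, and the latencies are all hypothetical, the scheduler is a simplified greedy variant without backtracking, and the operations are assumed to be listed in topological order of their same-iteration dependences.

```python
from math import ceil

def res_mii(op_units, unit_counts):
    """Resource bound: for each unit class, ceil(uses per iteration / units available)."""
    uses = {}
    for unit in op_units.values():
        uses[unit] = uses.get(unit, 0) + 1
    return max(ceil(n / unit_counts[u]) for u, n in uses.items())

def rec_mii(ops, edges):
    """Recurrence bound: smallest II for which no dependence cycle has
    positive weight, where an edge weighs latency - II * distance.
    Assumes every cycle carries a distance of at least 1."""
    neg = float("-inf")
    upper = max(1, sum(lat for _, _, lat, _ in edges))
    for ii in range(1, upper + 1):
        d = {u: {v: neg for v in ops} for u in ops}
        for u, v, lat, dist in edges:
            d[u][v] = max(d[u][v], lat - ii * dist)
        for k in ops:  # Floyd-Warshall on longest paths
            for i in ops:
                for j in ops:
                    if d[i][k] + d[k][j] > d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        if all(d[u][u] <= 0 for u in ops):
            return ii
    return upper

def modulo_schedule(ops, op_units, unit_counts, edges, max_ii=64):
    """Greedy modulo scheduling: start at MII and raise II until every op fits."""
    mii = max(res_mii(op_units, unit_counts), rec_mii(ops, edges))
    for ii in range(mii, max_ii + 1):
        mrt = [dict() for _ in range(ii)]  # modulo resource table: slot -> unit -> used
        start = {}
        for op in ops:
            # Earliest cycle allowed by already-scheduled predecessors.
            earliest = max([0] + [start[u] + lat - ii * dist
                                  for u, v, lat, dist in edges
                                  if v == op and u in start])
            # Latest cycle allowed by already-scheduled successors (loop-carried edges).
            latest = min([earliest + ii - 1] + [start[v] + ii * dist - lat
                                                for u, v, lat, dist in edges
                                                if u == op and v in start])
            unit = op_units[op]
            for t in range(earliest, latest + 1):
                row = mrt[t % ii]
                if row.get(unit, 0) < unit_counts[unit]:
                    row[unit] = row.get(unit, 0) + 1
                    start[op] = t
                    break
            else:
                break  # op could not be placed at this II; try a larger one
        else:
            return ii, start
    raise ValueError("no schedule found up to max_ii")

# Hypothetical loop body: acc += a[i] * k; out[i] = acc
ops = ["load", "mul", "add", "store"]
op_units = {"load": "mem", "mul": "mul", "add": "alu", "store": "mem"}
unit_counts = {"mem": 1, "mul": 1, "alu": 1}
# Edges are (source, destination, latency, iteration distance).
edges = [("load", "mul", 2, 0), ("mul", "add", 3, 0),
         ("add", "store", 1, 0), ("add", "add", 1, 1)]

ii, start = modulo_schedule(ops, op_units, unit_counts, edges)
print(ii, start)
```

In this made-up example the single memory unit is used twice per iteration, so the resource bound is 2, while the accumulator recurrence only requires 1; the loop is therefore scheduled with an initiation interval of 2. A production scheduler would also handle clustering, register pressure, and backtracking, which this sketch leaves out.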

Keywords/Search Tags:

Compiler, Software pipelining, Initiation interval, Modulo scheduling, Register allocation, Multi-cluster optimization, Predicate optimization

Related items

1	Research On Software Pipelining Based On GCC
2	Research Of Software Pipelining Framework Based On BWDSP
3	The Research And Implementation Of Predicated Execution
4	Automatically constructing compiler optimization heuristics using supervised learning
5	Compiler-based Improvements To Register Allocation Strategies
6	Predicate Compiler Technology And Deep Code Optimization
7	Research On Software Pipelining Techniques For EPIC Architectures
8	Research Of Automatic Compiler Tuning Base On Machine Learning
9	Research On Some Key Compiler Techniques For Embedded Processors
10	The Key Technology Research, Instruction-level Parallelism Compiled