
Research on SIMD Compiler Optimization Methods

Posted on: 2006-01-06; Degree: Doctor; Type: Dissertation
Country: China; Candidate: J H Zhu
GTID: 1118360212484472; Subject: Computer system architecture
Abstract/Summary:
Multimedia has become a dominant computing workload, and in response almost all general-purpose processor vendors have integrated multimedia extensions (MME) into their processors. Because multimedia applications offer abundant parallelism and require only low computational precision, most MMEs are implemented as Single Instruction Multiple Data (SIMD) instruction sets.

At present, programmers can mainly exploit these MMEs only at the assembly level, through inline assembly code or intrinsic functions. Development with these methods is extremely inefficient, and the resulting code is hard to port between platforms. An alternative is to have the compiler automatically generate SIMD instructions from code written in a standard high-level programming language.

Although SIMD optimization is a form of vectorization, traditional vectorization techniques cannot simply be carried over to SIMD optimization, because vector processors and SIMD architectures differ. Currently, only a few compilers can speed up even some individual multimedia applications.

Based on an in-depth study of SIMD architecture and a broad analysis of multimedia workloads, we carried out a series of studies to develop efficient SIMD optimization techniques. We also implemented these techniques in the open source compiler Gcc3.5 and in the parallelization research platform Aggassiz. The work of this dissertation includes:

1. The dissertation thoroughly surveys current research on SIMD architecture, multimedia workloads, and SIMD compilation techniques. One conclusion is that there are clear gaps between SIMD compilation techniques and traditional vectorization techniques.
2. To meet the requirements of a SIMD compiler, we introduce "valid bits analysis", a bidirectional data flow analysis, as the basic analysis for further optimizations. It is used to control the bit width of data and to increase parallelism.
3. Because the packed arithmetic of SIMD architectures conflicts strongly with the Integer Promotion rule of the programming language standard, we introduce overflow-controlled SIMD arithmetic to optimize the affected code.
4. We vectorize linear recurrences in saturated arithmetic mode, which avoids exchanging data between the serial components and the SIMD components and so improves the performance of the affected code. We also propose a method for discovering Cyc-Constant, which exposes more potential parallelism.
5. We implemented two prototype systems, one in the open source compiler Gcc3.5 and one in the parallelization research platform Aggassiz, verified the correctness of the SIMD optimization techniques, and measured the performance improvement.

In conclusion, to strengthen SIMD optimization we discuss several critical techniques, such as how to perform highly accurate data bit width analysis and how to expose potential parallelism in saturated arithmetic mode. We also verified the correctness of these techniques and analyzed the performance improvement they deliver.
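The conflict between Integer Promotion and packed arithmetic mentioned in item 3 can be illustrated with a small C sketch. This is not the dissertation's algorithm. It only shows, for one hypothetical kernel (averaging two 8-bit pixels), why a narrow SIMD lane cannot blindly replace promoted C arithmetic, and how a rewrite that never needs a ninth bit restores equivalence. All function names are assumptions introduced for illustration, and the 8-bit lane is emulated with scalar code rather than real SIMD instructions.

```c
#include <stdint.h>

/* Reference semantics: C promotes a and b to int before the add,
   so the intermediate sum may need 9 bits. */
uint8_t avg_promoted(uint8_t a, uint8_t b) {
    return (uint8_t)((a + b) >> 1);
}

/* Emulates a naive 8-bit packed add: the sum wraps modulo 256
   before the shift, so the carry bit is lost. */
uint8_t avg_wrapping(uint8_t a, uint8_t b) {
    uint8_t s = (uint8_t)(a + b);   /* truncate to the lane width */
    return (uint8_t)(s >> 1);
}

/* Overflow-free rewrite that fits in an 8-bit lane.
   It relies on a + b == 2*(a & b) + (a ^ b). */
uint8_t avg_narrow(uint8_t a, uint8_t b) {
    return (uint8_t)((a & b) + ((a ^ b) >> 1));
}

/* Counts the (a, b) pairs for which f disagrees with the
   promoted reference, over all 65536 input combinations. */
int count_mismatches(uint8_t (*f)(uint8_t, uint8_t)) {
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (f((uint8_t)a, (uint8_t)b) != avg_promoted((uint8_t)a, (uint8_t)b))
                bad++;
    return bad;
}
```

Calling `count_mismatches(avg_wrapping)` reports every pair whose sum exceeds 255, while `count_mismatches(avg_narrow)` reports none. A bit width analysis of the kind the abstract describes would need to establish such facts automatically before a compiler could choose a narrow lane width.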
Keywords/Search Tags: Multimedia Application, Single Instruction Multi Data Instruction Set, SIMD Optimization