
Shared Memory SIMD Architecture Compiler Optimization And Structural Studies

Posted on: 2007-12-24 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: W H Zhang | Full Text: PDF
GTID: 1118360212984716 | Subject: Computer architecture
Abstract/Summary:
Multimedia has become a dominant computing domain. To meet this trend, shared memory multiple-SIMD architectures are widely used as multimedia accelerators in mobile multimedia processing. Because of power consumption and die size considerations, most of them are constrained by a shared data bus and a limited number of registers. Although these constraints simplify chip design, they are the major obstacles to mapping real multimedia applications onto these architectures. Traditional optimization techniques cannot solve these problems for shared memory multiple-SIMD architectures, and to the best of our knowledge there has been very little research on optimization techniques for them. As a result, these architectures are mainly hand-programmed in assembly code. Moreover, although shared memory multiple-SIMD architectures are widely used as multimedia accelerators, there is very little systematic research on how their properties, such as array size and execution mode, influence the processing ability of their host SOCs.

Although shared memory multiple-SIMD architectures have been widely deployed, research on compilation techniques for them lags far behind the development of the hardware. As a result, shared memory multiple-SIMD architectures can only be hand-programmed in assembly code. Programmers must not only write assembly code for new multimedia applications but also port existing C implementations to the architecture. These are extremely tedious tasks that inhibit wider acceptance of shared memory multiple-SIMD architectures. In addition, the lack of compiler support impedes further research on these architectures.
To the best of our knowledge, there is no systematic research on how the properties of shared memory multiple-SIMD architectures, such as array size and execution mode, influence the processing ability of their host SOCs. It is therefore urgent to provide effective optimization techniques that broaden the application area of shared memory multiple-SIMD architectures for real multimedia applications and support further architectural research on them.

Although a shared memory multiple-SIMD architecture contains multiple SIMD units, those units cannot execute in parallel because of the shared data bus constraint. To exploit the parallelism of the architecture, it is therefore important to reduce contention for the shared data bus and to improve its utilization. Although traditional locality algorithms can reduce data bus contention to some extent, they cannot be applied to the optimization of shared memory multiple-SIMD architectures, for the following reasons:
1. The operands of traditional locality algorithms are scalars, whereas the operands of shared memory multiple-SIMD architectures are data vectors.
2. Traditional locality algorithms can only reduce data bus contention; they do not consider how to improve the utilization of the shared data bus.
3. Traditional locality algorithms focus mainly on the cache. However, each SIMD unit in a shared memory multiple-SIMD architecture has only a limited number of registers, so reducing the amount of spilled data is also very important for reducing conflicts on the shared data bus.

This dissertation begins by analyzing both the opportunities and the obstacles of SIMD optimization. We then present a compiler framework that successfully optimizes real multimedia applications for shared memory multiple-SIMD architectures.
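The effect of the shared data bus constraint can be illustrated with a toy timing model. The sketch below is not taken from the dissertation: the function names, the one-vector-per-cycle bus, and the in-order service of units are all assumptions made for illustration. It shows how serialized vector loads limit the fraction of time the SIMD units spend computing.

```python
def step_cycles(num_units, loads_per_unit, compute_cycles):
    """Cycles for one step when all vector loads share a single bus.

    Assumptions of this toy model (not taken from the dissertation):
    the bus moves one vector per cycle, units are served in order, and a
    unit starts computing as soon as its own loads have arrived.
    """
    bus_cycles = num_units * loads_per_unit   # loads are fully serialized
    return bus_cycles + compute_cycles        # the last unit served finishes last


def utilization(num_units, loads_per_unit, compute_cycles):
    """Fraction of unit-cycles spent computing rather than waiting."""
    total = step_cycles(num_units, loads_per_unit, compute_cycles)
    busy = num_units * compute_cycles
    return busy / (num_units * total)


if __name__ == "__main__":
    # Compare two loads per unit with one load per unit on 8 units.
    for loads in (2, 1):
        print(loads, step_cycles(8, loads, 4), round(utilization(8, loads, 4), 3))
```

In this model, halving the number of loads per unit shortens a step on 8 units from 20 to 12 cycles. The numbers are purely illustrative and are unrelated to the experimental results reported in the abstract.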
Our overall objective is to find as much parallelism as is necessary to saturate the available hardware in a shared memory multiple-SIMD architecture. We reduce data bus contention by increasing register locality, improve data bus utilization by exploiting the broadcast property of the shared data bus, and address the limited number of registers through a resource allocation algorithm. The framework has been implemented in Agassiz, a source-to-source compiler tool for C programs developed by the Parallel Processing Institute, Fudan University, and the Department of Computer Science and Engineering, University of Minnesota, Twin Cities. It successfully optimizes the code of real multimedia applications for shared memory multiple-SIMD architectures. The experimental results show that, for real applications, the framework achieves an average speedup of 3.62 and an average utilization of 56% on a shared memory multiple-SIMD architecture with 8 SIMD units.

With the help of the compiler framework, we also carry out further research on the shared memory multiple-SIMD architecture itself. We test and analyze how its properties influence performance, including array size, execution mode, number of registers, shared data bus, topology, and instruction buffer size. Based on the experimental results, we make several proposals for chip design.

This dissertation makes the following contributions:
1. It presents a compiler framework that is effective for mapping real multimedia applications onto shared memory multiple-SIMD architectures.
2. It gives systematic algorithms to address the constraints of the shared data bus and the limited number of registers. These algorithms not only reduce contention for the shared data bus but also improve its utilization.
3. The compiler framework not only automatically maps multimedia applications onto shared memory multiple-SIMD architectures but also accelerates research on them. With the help of this compiler, we study the architecture and, based on the experimental results, make several proposals for the design of shared memory multiple-SIMD architecture SOCs.
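To make the interplay of register locality, bus broadcast, and the limited register count more concrete, here is a minimal counting sketch. It is not the dissertation's algorithm: the toy kernel, the function `bus_transfers`, and all of its parameters are hypothetical and chosen only for illustration. It counts how many vector transfers cross the shared bus under different strategies.

```python
def bus_transfers(units, steps, blocks, free_regs, broadcast, reuse):
    """Count vector transfers on the shared bus for a toy kernel.

    Toy kernel (an assumption, not from the dissertation): for each of
    `blocks` data blocks, every unit performs `steps` multiply-accumulates,
    each using one private vector and one coefficient vector that is
    identical across units and across blocks.
    """
    # Private data always crosses the bus once per unit, step and block.
    private = units * steps * blocks
    # Coefficient vectors that stay resident in registers across blocks.
    resident = min(steps, free_regs) if reuse else 0
    # A resident coefficient is fetched once; the rest are refetched every block.
    fetches = resident + (steps - resident) * blocks
    # With broadcast, one bus transfer serves all units; otherwise one per unit.
    shared = fetches * (1 if broadcast else units)
    return private + shared


if __name__ == "__main__":
    args = dict(units=8, steps=4, blocks=10, free_regs=2)
    print("naive:            ", bus_transfers(broadcast=False, reuse=False, **args))
    print("broadcast:        ", bus_transfers(broadcast=True, reuse=False, **args))
    print("broadcast + reuse:", bus_transfers(broadcast=True, reuse=True, **args))
```

With these illustrative parameters, broadcasting removes most of the redundant coefficient traffic, and keeping coefficients in the few free registers removes some of what remains. How much remains depends directly on `free_regs`, which is why the limited register count matters for bus contention in this model.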
Keywords/Search Tags: shared memory multiple-SIMD architecture, optimization, architecture, shared data bus