Architectural Optimization Of Data Storage And Organization For SIMD Media Processors

Posted on: 2008-01-26; Degree: Master; Type: Thesis
Country: China; Candidate: C Y Liu
GTID: 2178360212489420; Subject: Circuits and Systems
Abstract/Summary:
SIMD media processors are widely used in multimedia processing because of their good programmability and high performance. However, further performance gains are limited by a bottleneck in non-computing operations, namely data storage and data organization. The main causes are the flexible data access patterns of multimedia processing and the strong dependence of SIMD technology on algorithm regularity. To further improve the performance of SIMD media processors, this thesis proposes architectural optimizations for data organization and for data storage.

First, for data organization, an Explicit Data Organization SIMD (EDO-SIMD) instruction set architecture is proposed to reduce the overhead of data organization operations. It describes the data permutation information explicitly in the instruction word and merges the data organization operation into the data computation and storage operations. An implementation of the EDO-SIMD ISA based on a baseline SIMD processor is described. Simulation results show that, compared to the baseline SIMD architecture, the EDO-SIMD ISA achieves speedups of 1.34 to 1.40 on a real-time H.264/AVC decoder benchmark and reduces code size by 17.7%, with only a 0.49% increase in hardware area.

Second, on-chip data storage is optimized in two respects. The first is to combine stream access with a 2-D parallel memory system, for which an efficient 2-D stream memory system is proposed. It maps a 2-D logical space onto physical parallel memory modules. Data are interleaved in memory to support both horizontal and vertical array access, and the interleave scheme improves on previous interleave schemes for streaming access. Experimental results show that the proposed stream memory system reduces the memory access rate by 32.0% and the execution cycle count by 25.4% for the real-time H.264/AVC decoder benchmark.

The second is to remove the memory redundancy of the previous linear skewing interleave scheme and to support modulo addressing, for which an optimized linear skewing interleave scheme is proposed. The proposed scheme supports simultaneous access to multiple subarray types of data elements in a 2-D data space with modulo addressing. It uses 2pq memory modules (pq is the number of parallel ways) without redundancy in order to save on-chip memory, with linear skewing in the horizontal direction and nonlinear skewing in the vertical direction. Results show that, compared to previous linear skewing schemes, the proposed scheme reduces on-chip memory by 13.6% for pq = 4 or 8 and, owing to modulo addressing, reduces external memory bandwidth by 35.5% on a motion estimation benchmark.
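
To make the interleaving idea more concrete, here is a minimal Python sketch of textbook linear skewing, in which element (row, col) of a 2-D array is stored in module (row * skew + col) mod N. This is not the optimized scheme proposed in the thesis; the function names, the `skew` parameter, and the four-module configuration are assumptions chosen purely for illustration.

```python
# Illustrative only: classic linear skewing, not the thesis's optimized scheme.
# All names and parameters here are hypothetical.

def module_index(row, col, skew, num_modules):
    """Map element (row, col) of a 2-D array to a parallel memory module."""
    return (row * skew + col) % num_modules


def is_conflict_free(cells, skew, num_modules):
    """True if every cell in the access lands in a different module."""
    modules = [module_index(r, c, skew, num_modules) for r, c in cells]
    return len(set(modules)) == len(modules)


def row_access(row, col, length):
    """Cells of a horizontal access starting at (row, col)."""
    return [(row, col + k) for k in range(length)]


def column_access(row, col, length):
    """Cells of a vertical access starting at (row, col)."""
    return [(row + k, col) for k in range(length)]


def block_access(row, col, height, width):
    """Cells of a height x width subarray whose top-left corner is (row, col)."""
    return [(row + i, col + j) for i in range(height) for j in range(width)]


if __name__ == "__main__":
    NUM_MODULES = 4  # assumed number of parallel memory modules
    for skew in (1, 2):
        print(
            "skew", skew,
            "row:", is_conflict_free(row_access(3, 5, 4), skew, NUM_MODULES),
            "column:", is_conflict_free(column_access(3, 5, 4), skew, NUM_MODULES),
            "2x2 block:", is_conflict_free(block_access(3, 5, 2, 2), skew, NUM_MODULES),
        )
```

With four modules, a skew of 1 serves four-element rows and columns without conflicts but not 2x2 blocks, while a skew of 2 serves rows and 2x2 blocks but not columns. This kind of trade-off is one reason schemes that support several subarray types at once tend to use more modules or a different skew in each direction.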
Keywords/Search Tags: SIMD Media Processor, Explicit Data Organization, EDO-SIMD, Stream Access, 2-D Memory System, Interleave Scheme