
Design And Implementation Of The Scalar And Vector Scratch Pad Memory On GX64-DSP Chip

Posted on: 2016-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
GTID: 2348330509960530
Subject: Software engineering
Abstract/Summary:
DSP computing capability has grown rapidly, while memory performance has improved only slowly. As a result, memory is now one or more orders of magnitude slower than the DSP core. In a Cache-based hierarchical storage scheme, the delay caused by Cache misses cannot be ignored in a DSP with strict real-time requirements, and the design complexity and power consumption of a Cache grow quickly as the target hit rate rises. By contrast, the power consumption and area of a Scratch Pad Memory (SPM) are both 30% to 40% lower than those of a Cache. In addition, Cache misses never occur with an SPM solution, which is a significant advantage for a DSP with strict real-time requirements. Against the background of the GX64-DSP, the purpose of this thesis is to implement the on-chip VSPM (Vector Scratch Pad Memory) and SSPM (Scalar Scratch Pad Memory) of the GX64-DSP. The main tasks and innovations are as follows:
1. A vector and scalar instruction set is designed that supports multi-granularity access and multiple addressing modes. A reorder instruction is also proposed to accelerate the FFT algorithm.
2. VSPM supports double vector Load/Store instructions and parallel DMA read/write operations with a low collision rate. The access bandwidths reach 2048 bits and 512 bits, respectively, providing high access bandwidth for the vector computing unit with its SIMD architecture. Banks are organized by interleaving high-order and low-order address bits to reduce the conflict rate. In addition, VSPM supports non-aligned and cross-row accesses. Furthermore, it uses 16-way data reordering to accelerate the FFT algorithm.
3. SSPM supports single vector Load/Store instructions and parallel DMA read/write operations with a low collision rate, and their access bandwidths reach 256 bits and 512 bits, respectively. Functionally, it can replace the Cache, using an ordinary DMA access mechanism for data transfers. This mechanism offers higher performance and lower control-logic overhead than a Cache.
4. Detailed functional verification of VSPM and SSPM is carried out using assembly-language stimuli. The results show that 100% coverage was achieved and no functional problems were found. Logic synthesis of the design is then performed in a 40 nm process. The operating frequency reaches 1 GHz, which meets the design requirements.
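The effect of bank organization on access conflicts, mentioned in item 2, can be illustrated with a small model. This is only a sketch under stated assumptions, not the thesis's design: the bank count (16), the bank depth (1024 words), single-ported banks, and the two mapping functions are all hypothetical choices made for illustration. It compares a pure low-order mapping with a pure high-order mapping and counts the extra cycles needed when several lanes of one vector access hit the same bank.

```python
from collections import Counter

NUM_BANKS = 16     # assumption for illustration, not taken from the thesis
BANK_WORDS = 1024  # assumed number of words stored in each bank


def bank_low_order(addr):
    """Low-order interleaving: consecutive words map to consecutive banks."""
    return addr % NUM_BANKS


def bank_high_order(addr):
    """High-order mapping: each bank holds one contiguous block of words."""
    return (addr // BANK_WORDS) % NUM_BANKS


def extra_cycles(addrs, bank_of):
    """Extra cycles for one vector access, assuming single-ported banks.

    Lanes that hit the same bank are serialized, so the cost is the
    largest number of lanes sharing a bank, minus one.
    """
    hits = Counter(bank_of(a) for a in addrs)
    return max(hits.values()) - 1


def strided(base, stride, lanes=16):
    """Word addresses touched by a strided vector access."""
    return [base + i * stride for i in range(lanes)]


if __name__ == "__main__":
    for stride in (1, 16, 1024):
        addrs = strided(0, stride)
        print(
            f"stride {stride:5d}: "
            f"low-order extra cycles = {extra_cycles(addrs, bank_low_order)}, "
            f"high-order extra cycles = {extra_cycles(addrs, bank_high_order)}"
        )
```

In this toy model, unit-stride accesses are conflict-free only under the low-order mapping, while accesses with a stride equal to the bank depth are conflict-free only under the high-order mapping. Neither mapping alone handles both patterns, which gives some intuition for why an organization that mixes high-order and low-order address bits may lower the overall conflict rate.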
Keywords/Search Tags: SPM, DSP, SIMD, Access Conflict, DMA, Arbitration