Study On Local Design Optimization Of High Performance DSP's On-Chip Storage System

Posted on: 2005-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Zhang
Full Text: PDF
GTID: 2168360155971832
Subject: Electronic Science and Technology
Abstract/Summary:
Digital signal processing (DSP) has developed rapidly and is now widely used in many fields. The growing demand for DSP performance brings unprecedented challenges and opportunities for the storage system. Although caches are widely used in general-purpose processors (GPPs), whether they suit DSPs is still disputed, because the unpredictability and relatively large latency of cache misses sometimes cannot meet the strict requirements of real-time processing. Nevertheless, caches are a trend in DSP design. Designing and implementing an appropriate cache-based DSP storage system is the main task of this master's thesis.

The high-performance YHFT_Dx, which we designed independently, adopts "on-chip RAM plus two-level cache" as its storage system structure, which offers good scalability. However, the large latency caused by cache misses still needs to be reduced. This thesis proposes several measures to optimize the storage system performance of YHFT_D3 and implements them in turn.

First, we simplify the path by which miss data enters the chip by rebuilding the L2 and EMIF components of YHFT_D3 and redesigning the interface logic between them. After this change, L2 can request data directly from the EMIF on a miss without passing through the EDMA. This measure also increases the EMIF burst length when it transfers data to L2, which shortens the time data spends in transit and greatly reduces the L2 miss latency.

Second, Load/Store instructions account for a large proportion of instructions and exhibit good parallelism in a VLIW architecture. Based on this characteristic, we propose the concept of a "miss pipeline", which starts working when L1D encounters misses. With the miss pipeline, the latency of a single read miss is reduced from 9 cycles to 6 cycles, and the average latency of multiple read misses is reduced to 3~4 cycles.

Synthesis with Design Compiler, under worst-case operating conditions and using Artisan standard cells in 0.18 μm technology, shows that the rebuilt components meet both timing and area requirements at a frequency above 150 MHz, and that the area does not increase noticeably compared with YHFT_D3. Simulation at different design levels ensures that the design works correctly. Finally, a group of benchmarks is run to analyze the performance gained from the optimization, and conclusions are drawn.
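The averaging effect behind the multi-read figure can be illustrated with a small model. This is only a sketch under stated assumptions, not the thesis's actual miss pipeline: it assumes that back-to-back read misses overlap so that, after the first miss completes in 6 cycles (the single-miss figure from the abstract), each further miss completes a fixed number of cycles later. The function name `average_miss_latency` and the 3-cycle interval are hypothetical, chosen only for illustration.

```python
def average_miss_latency(num_misses, first_latency, issue_interval):
    """Average per-miss latency when consecutive misses overlap in a pipeline.

    Simplified model (an assumption, not taken from the thesis): the first
    miss costs `first_latency` cycles, and each later miss completes
    `issue_interval` cycles after the previous one.
    """
    if num_misses < 1:
        raise ValueError("num_misses must be at least 1")
    total_cycles = first_latency + (num_misses - 1) * issue_interval
    return total_cycles / num_misses


SINGLE_MISS_CYCLES = 6       # single read miss latency stated in the abstract
ASSUMED_ISSUE_INTERVAL = 3   # hypothetical value, for illustration only

for n in (1, 2, 4, 8):
    avg = average_miss_latency(n, SINGLE_MISS_CYCLES, ASSUMED_ISSUE_INTERVAL)
    print(f"{n} overlapping misses: {avg:.2f} cycles on average")
```

Under these assumed parameters the average approaches the assumed interval as more misses overlap, which matches the general shape of the claim that multi-read misses cost fewer cycles each than an isolated miss. The real interval depends on the L1D and L2 design described in the thesis.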
Keywords/Search Tags: Data Cache, DSP, Miss Pipeline, Miss Latency, Design Verification