
Research Of High Performance, Low Power Design For VLIW Architecture Digital Signal Processor: Prototype, Algorithm & Implementation

Posted on: 2008-03-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y F Li
GTID: 1118360242976097
Subject: Computer system architecture
Abstract/Summary:
Mobile computing and digital multimedia are becoming the trend of the post-PC age, which makes digital circuits rely more and more on specific applications. Facing demands for large quantities of data and computation, application engineers find that the available processors cannot meet their design constraints, and it is hard to meet time-to-market and design schedules. In the area of low-power design, this has prompted a great deal of research on gated clocks. The basic idea is to minimize dynamic power consumption by eliminating unnecessary triggering of the circuit. Various techniques have been proposed in the past few years, with the main efforts focused on automatic insertion of gated logic blocks after analysis of the netlist.

As chip scale and power consumption increase, more complex problems appear during testing. High current density and overheating may permanently damage the circuit under test. In addition, the high rate of current in the power and ground rails may cause excessive power/ground noise, which can disturb the normal operation of the circuit, causing some good dies to fail the test and resulting in manufacturing loss. Low-power DFT methodology has therefore become a highlight of low-power research. New design methods and flows should be proposed. To be feasible and practical, they should make the fewest possible changes to the current design flow and come at minimum cost.

This thesis first introduces the basic ideas and main research topics, then focuses on four key issues and proposes solutions to them: the VLIW DSP prototype, the gated micro-architecture for low power consumption, low-power testing technology, and design methodology. We implemented Ares, a low-power prototype based on the TSMC 0.18um Generic Process. Simulation results showed the validity of our methodology: we gained a significant power reduction at a low cost in extra circuitry.
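For context, the gated-clock idea described above can be related to the standard first-order model of dynamic (switching) power in CMOS logic. This is textbook background, not a formula taken from the thesis, and the symbol names are chosen here for illustration:

```latex
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
```

Here \alpha is the switching activity factor, C_L the switched load capacitance, V_{DD} the supply voltage, and f the clock frequency. Gating the clock of an idle block pushes that block's \alpha toward zero, which is why eliminating unnecessary triggers reduces dynamic power without changing voltage or frequency.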
For standard video coding, communication, and storage systems, we extended the instruction set at very low cost. The micro-architectures presented here helped to increase the performance of the DSP significantly. To show the validity of the partition-based DFT flow, we ran simulations on an ARM9-based SoC implemented in the TSMC 0.18um Generic Process. The results showed that good yield and high fault coverage were obtained in the testing stage.

The main contributions of the thesis can be summarized as follows:

A virtual prototype of a VLIW DSP featuring a RISC architecture is proposed. This prototype can be seen as a parallel processor with multiple RISC micro-architectures. The VLIW architecture tries to map an existing VLIW instruction set onto a static superscalar structure composed of extendable RISC micro-architectures. Based on this prototype, we implemented a 16-bit instruction set VLIW structure named Ares.

Based on the Ares DSP core, we proposed a solution to minimize the dynamic power consumption of the pipeline, focused on eliminating useless triggering. A predictive algorithm based on resource conflict detection is proposed. We optimized the pipelines for every stand-alone channel of Ares and presented the hierarchy of the overall clock distribution of Ares. Simulation results indicated that the algorithm brings a significant dynamic power reduction at a considerably low circuit cost.

In the fields of video coding and error correction, several low-cost architectures are proposed to enhance the DSP instruction set: a low-cost, high-performance SAD architecture, a high-performance AVS inverse integer transform architecture, and a high-density error-correcting architecture for ultra-long BCH codes.

To address power consumption in Design-for-Test, we proposed a partition-based method and EDA design flow that partitions the chip into multiple independent clock domains.
This flow is applied to mass production and testing. Moreover, an auto-partitioning algorithm is proposed to reduce the cost of the test circuitry introduced by partitioning, to preserve fault coverage, and to produce more "reasonable" partitions.
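As an illustration of the kind of kernel a SAD architecture accelerates, the following is a minimal software sketch, assuming SAD here means the sum of absolute differences used as a block-matching cost in video motion estimation. The function names `sad` and `best_match` are hypothetical and chosen for this example; the sketch says nothing about how the hardware in the thesis is organized.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks.

    Each block is a list of rows, and each row is a list of integer pixels.
    """
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for row_a, row_b in zip(block_a, block_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        # Accumulate |a - b| for every pixel pair in this row.
        total += sum(abs(a - b) for a, b in zip(row_a, row_b))
    return total


def best_match(current, candidates):
    """Return (index, cost) of the candidate block with the smallest SAD."""
    scores = [sad(current, candidate) for candidate in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

A motion-estimation loop would call `best_match` once per block with the candidate blocks drawn from a search window in the reference frame. Because the inner loop is only subtractions, absolute values, and additions, it is a natural target for a dedicated instruction-set extension.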
Keywords/Search Tags: VLIW, DSP, micro-architecture, Low power, Design for Test, Clock gate, prototype