
Research For Micro-Architecture Optimization On Media DSP IP Core

Posted on: 2012-07-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W G Cai
GTID: 1118330332483545
Subject: Communication and Information System
Abstract/Summary:
With the development of integrated circuit technology and information processing technology, microprocessor design has become one of today's research focuses. Advances in process technology and the increasing demands of applications have a significant impact on microprocessor architecture and design methodology. In the embedded field especially, high performance, low power consumption, rich software support, and a relatively short verification time are important for embedded processor design.

The author participated in the project of the media digital signal processor IP core MediaDSP64, which is developed by the SoC R&D Group of Zhejiang University. As part of the research results, this thesis focuses on processor design and structure optimization. On the basis of binary compatibility for the kernel instruction set, the work covers three aspects: first, application-oriented instruction set configuration; second, pipeline optimization for the data path and control path; and third, implementation of the multi-issue out-of-order feature in a complex DSP processor.

Instruction set configuration is classified into two types: designing a whole instruction set for the target applications, and designing several special instructions for a specific algorithm. With media processing as an example, the former is demonstrated by SIMD instruction optimization. Besides extending the operation width, the memory access and execution units are co-optimized, which reduces the data arrangement and bit-width preprocessing that SIMD requires. The latter is illustrated by optimization for bit-stream processing. According to the pipeline structure, several serial operations in a loop can be integrated into a single instruction, which greatly improves processing efficiency and code density.

Based on an analysis of complex DSP instruction execution, a distributed data forwarding structure is constructed, and an adaptive backup register mechanism is designed to solve the data-missing problem. An early write-back strategy is proposed to reduce the number of data sources in the forwarding network, and a shadow register mechanism is adopted to guarantee precise exceptions. For the control path, the immediate detection algorithm is replaced by an early detection algorithm, in which the timing delay of the detection circuit is hidden from the processor's critical path. With these improvements, the DSP can work at 400 MHz under TSMC 130 nm (Generic and Worst Case) technology.

A multi-issue out-of-order DSP named MD64SS is developed to exploit instruction-level parallelism effectively. Instruction compatibility is realized by partitioning each complex DSP instruction into several micro-instructions, and a coloring algorithm is adopted to ensure atomic commitment of complex DSP instructions. A just-in-time decoding technique is proposed to implement instruction encapsulation, which avoids large modifications when new sub-pipelines or new instructions are added. By combining register broadcast with an instruction counter mechanism, the critical path of the wakeup logic is divided into two parts, which reduces the timing delay of the issue unit. Performance estimation shows a 50%-80% improvement compared with the original design. The DSP can work at 620 MHz under TSMC 130 nm Generic technology and 1030 MHz under TSMC 90 nm Fast technology.
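
To make the bit-stream optimization concrete, the following minimal Python sketch shows the kind of serial load, shift, mask, and pointer-update sequence that a software bit reader repeats in a loop, and that the abstract describes as a candidate for merging into one instruction. The function name `get_bits`, its signature, and the MSB-first bit order are assumptions chosen for illustration; the abstract does not describe the actual MediaDSP64 instruction or its semantics.

```python
from typing import Tuple


def get_bits(data: bytes, bit_pos: int, n: int) -> Tuple[int, int]:
    """Read n bits, MSB first, starting at bit_pos.

    Hypothetical helper: returns (value, new_bit_pos). Each loop
    iteration performs four dependent steps in sequence.
    """
    value = 0
    for _ in range(n):
        byte = data[bit_pos >> 3]                # 1. load the byte holding the next bit
        bit = (byte >> (7 - (bit_pos & 7))) & 1  # 2. shift and mask to isolate the bit
        value = (value << 1) | bit               # 3. append the bit to the result
        bit_pos += 1                             # 4. advance the stream pointer
    return value, bit_pos


stream = bytes([0b10110010, 0b01000000])
first, pos = get_bits(stream, 0, 3)     # first 3 bits: 101
second, pos = get_bits(stream, pos, 7)  # next 7 bits cross a byte boundary
```

Because each step depends on the previous one, a general-purpose instruction set spends several instructions per field read. A dedicated instruction that performs the whole sequence would replace that chain with a single operation, which is the efficiency and code-density gain the abstract refers to.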
Keywords/Search Tags:Media Processor, Instruction Configuration, Data Forwarding, Microarchitecture, Superscalar