Research On Acceleration Mechanisms Of Custom Instruction And Coprocessor

Posted on: 2010-04-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X L Du
GTID: 1118360302963023
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
As microprocessor performance has improved, the application field of embedded systems has expanded, and application complexity has become the key constraint on embedded-system performance. In fields such as digital media, automotive electronics, mobile TV, and smartphones, requirements change so frequently that it is hard for a microprocessor alone to meet the performance challenge. The main acceleration mechanisms at present include multi-processor acceleration, ASIC acceleration, custom instructions based on a configurable CPU, and coprocessor (FPGA/DSP) acceleration. The purpose of this thesis is to analyze the characteristics of these acceleration mechanisms and to explore the direction in which acceleration is developing.

The dissertation focuses on the realization and verification of custom-instruction and coprocessor acceleration. It covers the design methodology of configurable processors, design methods for custom instructions and coprocessors, the implementation of the two acceleration mechanisms and a comparison of their performance, and verification methods for processor-centered SoCs.

The main research includes:

1. In cooperation with team members, completed the design and verification of the project "H.264 coder/decoder SOC - VF1000", with responsibility for generating the configurable processor and designing the custom instructions. The SoC is based on multiple configurable processors and several hardware acceleration modules, and achieves video encoding/decoding at 30 fps for VGA. The system passed functional tests on a DOPOD mobile phone and an HP PDA. The bottlenecks of the algorithm were analyzed with ARC's Metaware, the CPU was generated with Architect2, and hardware descriptions of the custom instructions were implemented to accelerate the system.

2. Analyzed the design methodology and architecture based on configurable processors, and studied the transform and quantization algorithms in depth. An optimized design methodology for configurable processors is proposed.
The impact of custom-instruction implementation on performance was studied. By optimizing the scheduling algorithm and shortening the critical path, the transform and inverse transform and the quantization and inverse quantization were implemented. The design was implemented in UMC's 0.13 um CMOS process, and tests with JVT bitstreams demonstrated that it achieves real-time performance at 200 MHz. The design method was successfully applied to a storage-system acceleration project at Microsoft Research Asia.

3. In cooperation with team members, completed the design and verification of a high-performance floating-point coprocessor, with responsibility for the architecture design and verification of the coprocessor. System-level modeling based on SystemC was studied, a system-level modeling method combining SystemC, Verilog HDL, and VHDL was proposed, and an abstract-level model of VFP-A was implemented. VFP-A communicates through ARM's coprocessor interface and is compliant with the VFP11 instruction set. A new method for implementing multiplication rounding and for controlling the register file was proposed. By combining the single- and double-precision multiplication rounding algorithms and tightly integrating the partial-product decoder with partial-product compression, a high-speed pipelined multiplier was implemented. For register-file control, the three pipelines are prioritized: the pipeline with the highest priority is allowed to access the register file, while a pipeline with lower priority writes its data into a buffer. When the buffer holds valid data, the buffered data is written into the register file and the data from the pipeline is written into the buffer. If multiple pipelines need to write data and there is not enough buffer space for it, the lower-priority pipeline stalls until space is available. This method reduces the power and area of the register-file implementation.
VFP-A reaches 600 MHz in a 90 nm CMOS process, yielding a high-performance, low-cost floating-point coprocessor.

4. The characteristics of code-coverage-driven and function-coverage-driven methods were compared. A method for designing test cases that combines code coverage and function coverage was proposed, together with a method for selecting verification IP models according to the verification purpose and the required accuracy. In the unit test stage, interface timing and internal functions are verified with white-box verification, using code coverage as the measure of verification completeness. By analyzing the coverage, test cases are supplemented to improve verification efficiency. In the integration test stage, a bus functional model is chosen to replace the hardware IP, and the real logic implementation of the module is bypassed. In the system verification stage, the functions of the module are integrated into the verification feature set, and a design simulation model with a lower-abstraction-level description that supports cycle-accurate analysis is selected, ensuring both accuracy and flexibility. The design plan improves efficiency, flexibility, and portability, and the method for designing and choosing verification IP is highly general.
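
The register-file write control described in item 3 can be illustrated with a small behavioral sketch. This is not the VFP-A implementation. The class name `WriteArbiter`, the single write port, the FIFO buffer and its depth, and the rule that a slot freed by draining can be reused in the same cycle are all assumptions made for illustration, and write-ordering hazards are ignored.

```python
from collections import deque


class WriteArbiter:
    """Hypothetical model of prioritized register-file writes with a small buffer."""

    def __init__(self, buffer_depth=2):
        self.regfile = {}          # register name -> value
        self.buffer = deque()      # pending (register, value) writes, oldest first
        self.depth = buffer_depth  # assumed buffer capacity

    def cycle(self, requests):
        """Process one cycle of write requests.

        requests: list of (priority, register, value); 0 is the highest priority.
        Returns the priorities of the pipelines that must stall this cycle.
        """
        pending = sorted(requests, key=lambda r: r[0])

        if self.buffer:
            # Buffered data is valid: it takes the single write port.
            reg, value = self.buffer.popleft()
            self.regfile[reg] = value
        elif pending:
            # Buffer empty: the highest-priority pipeline writes directly.
            _, reg, value = pending.pop(0)
            self.regfile[reg] = value

        stalled = []
        for prio, reg, value in pending:
            if len(self.buffer) < self.depth:
                # Lower-priority data is parked in the buffer.
                self.buffer.append((reg, value))
            else:
                # No space left: this pipeline stalls and retries later.
                stalled.append(prio)
        return stalled


# Three pipelines request a write in the same cycle, with a one-entry buffer.
arb = WriteArbiter(buffer_depth=1)
first = arb.cycle([(2, "d2", 2.0), (0, "d0", 0.5), (1, "d1", 1.5)])
second = arb.cycle([(2, "d2", 2.0)])  # pipeline 2 retries after its stall
third = arb.cycle([])                 # idle cycle drains the buffer
```

In this sketch only one write reaches the register file per cycle, which is one plausible reading of why the scheme could save register-file area and power; the abstract itself does not state the number of write ports.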
Keywords/Search Tags: Custom instruction, Configurable processor, Coprocessor, Acceleration Mechanism, Function Verification