Design And Implementation Of MPEG-4 Video Decoder Based On Multi-Core Architecture

Posted on: 2011-09-12   Degree: Master   Type: Thesis
Country: China   Candidate: X Yu   Full Text: PDF
GTID: 2248330392451673   Subject: Circuits and Systems
Abstract/Summary:
This thesis first gives a brief introduction to video codec technology and to commonly used implementation schemes, together with their respective advantages and disadvantages. The features and the coding/decoding procedure of a widely used video standard, the MPEG-4 Advanced Simple (AS) Profile, are described using the Xvid codec model. An SoC platform based on the LEON3 microprocessor and its tool chain are also introduced.

The Xvid software model is ported to the LEON3 platform, and a decoding performance test on this system gives the percentage of CPU time consumed by each functional block during decoding. With the design target of decoding CIF-format video in real time, and taking into account factors such as speed, bandwidth, implementation complexity, flexibility and extensibility, the software/hardware (S/H) co-design method and a detailed partitioning scheme are determined through calculation and analysis.

In the traditional S/H co-design method, the CPU and the hardware acceleration blocks communicate over the system bus. This puts heavy pressure on system bus bandwidth and is not suitable for applications such as video decoding, which place heavy demands on data transfer. To resolve this problem, an architecture consisting of a CPU and multiple coprocessors is proposed, in which the coprocessors access off-chip memory directly rather than through the system bus. An application-specific coprocessor controller, CPC, is designed on the “Single Instruction, Multiple Data” principle. Picture-level S/H co-design is implemented between the CPU and the CPC using coprocessor instructions, while macroblock-level pipelined decoding by the two coprocessors (IDCT-CP and MP-CP) is carried out under the control of internal commands that the CPC issues after decoding the coprocessor instructions. The decoder architecture, the S/H co-design decoding procedure, and the system-level data and control flow are described in detail.
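The benefit of macroblock-level pipelining between two coprocessors can be illustrated with a small timing model. This is only a sketch under stated assumptions: the per-macroblock cycle counts are hypothetical, the stage order is not given in the abstract, and the first stage is assumed never to stall (enough buffering between the stages). None of the names below come from the thesis.

```python
def sequential_cycles(stage1, stage2):
    """Cycles when each macroblock completes both stages before the next one starts."""
    return sum(a + b for a, b in zip(stage1, stage2))


def pipelined_cycles(stage1, stage2):
    """Cycles for a two-stage pipeline.

    Assumption: stage 1 never stalls, i.e. there is enough buffering
    between the two stages to hold its finished macroblocks.
    """
    stage1_done = 0
    stage2_done = 0
    for a, b in zip(stage1, stage2):
        stage1_done += a                                  # stage 1 runs back to back
        stage2_done = max(stage1_done, stage2_done) + b   # stage 2 waits for its input
    return stage2_done


# Hypothetical figures: 396 macroblocks (one CIF frame), 300 and 400 cycles per stage.
MACROBLOCKS = 396
stage1 = [300] * MACROBLOCKS
stage2 = [400] * MACROBLOCKS

print("sequential:", sequential_cycles(stage1, stage2))
print("pipelined: ", pipelined_cycles(stage1, stage2))
```

In this model the pipelined total approaches the cost of the slower stage alone, which is why balancing the two coprocessors matters more than speeding up either one in isolation.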
After analysis and evaluation of the storage bandwidth the system requires, a “two-plane storage” scheme is proposed for storing the decoded reference picture; it significantly improves the efficiency with which storage bandwidth is used.

On the basis of this architecture, detailed design and implementation are carried out in three areas: software, hardware and S/H co-design. On the software side, the decoding function blocks are further optimized in code structure, data interfaces and algorithms, which improves performance by 46.69%. On the hardware side, complete design and implementation schemes for IDCT-CP and MP-CP are presented, covering function, operating procedure, interface signals and timing, bandwidth performance analysis, command word definition, and the detailed implementation of every sub-module. On the S/H co-design side, system control software is designed to control the whole decoding process by directing the coprocessors with coprocessor instructions, and a dedicated block, CPC, is designed as the interface that coordinates communication between software and hardware.

Reusability and extensibility are treated as important factors throughout the design. The independent coprocessor controller CPC separates the detailed coprocessor functions from the control software, so the decoder is much easier to port to other platforms. MP-CP has three independent operating modes (normal mode, bypass mode and software mode), each with different command words and S/H partitions, to suit applications with different characteristics. For motion prediction, a dedicated cache with a “Distance Flag” replacement algorithm is designed around the access pattern of motion prediction; according to the test results, it reduces the storage bandwidth used during motion prediction by about 20%.
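To give a sense of scale for the storage-bandwidth discussion, the size of one decoded CIF picture can be worked out directly. The sketch below assumes 8-bit 4:2:0 sampling (the format used by the MPEG-4 AS Profile); the frame rate and the baseline motion-prediction traffic passed in are illustrative parameters, not figures from the thesis.

```python
CIF_WIDTH, CIF_HEIGHT = 352, 288


def frame_bytes_420(width, height):
    """Bytes in one 8-bit 4:2:0 picture: a full-size luma plane plus two quarter-size chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def reference_write_bandwidth(width, height, fps):
    """Bytes per second needed just to write each decoded picture to memory once."""
    return frame_bytes_420(width, height) * fps


def after_cache_saving(baseline_bytes_per_s, saving=0.20):
    """Apply a fractional saving (the abstract reports about 20%) to a hypothetical baseline."""
    return baseline_bytes_per_s * (1.0 - saving)


print("bytes per CIF picture:", frame_bytes_420(CIF_WIDTH, CIF_HEIGHT))
print("write traffic at an assumed 30 fps:", reference_write_bandwidth(CIF_WIDTH, CIF_HEIGHT, 30))
```

Read traffic for motion prediction is typically larger than this write traffic because reference blocks overlap and are fetched repeatedly, which is the cost a dedicated cache is meant to reduce.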
A unified off-chip memory interface block, MCI, is designed to give every module that needs to transfer data to or from off-chip memory a standard access interface and timing protocol, which improves the extensibility of the system.

Finally, the decoder, implemented in Verilog HDL, is integrated into the SoC platform and subjected to a series of functional verification and performance tests. According to the test results, all functions of the decoder are correct, the decoding speed for a CIF-format video stream reaches up to 60 frames per second at a clock frequency of 80 MHz, and performance is about 4 to 6 times that before hardware acceleration, which fully meets the performance target. The design parameters, such as speed, area and power, are derived from logic synthesis using a 130 nm target library. The decoder has an excellent performance/cost ratio compared with the reference designs.
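The reported figures (CIF video, 60 frames per second, 80 MHz) imply an average cycle budget per macroblock, which the following sketch computes. It assumes standard 16x16 macroblocks and the usual 352x288 CIF resolution; the function name is illustrative.

```python
def cycle_budget(clock_hz, fps, width=352, height=288, mb_size=16):
    """Return (macroblocks per frame, cycles per frame, cycles per macroblock)."""
    macroblocks = (width // mb_size) * (height // mb_size)
    cycles_per_frame = clock_hz / fps
    return macroblocks, cycles_per_frame, cycles_per_frame / macroblocks


mbs, per_frame, per_mb = cycle_budget(clock_hz=80_000_000, fps=60)
print(f"{mbs} macroblocks per frame")
print(f"{per_frame:.0f} cycles per frame")
print(f"{per_mb:.0f} cycles per macroblock on average")
```

At these settings the decoder has on the order of a few thousand clock cycles, on average, to finish each macroblock, including all memory traffic.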
Keywords/Search Tags: MPEG-4, Video Decoding, Software/Hardware Co-design, Co-processor