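Design of specialized processors for real-time video processing with local filters (original title: Conception de processeurs spécialisés pour le traitement vidéo en temps réel par filtre local)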

Posted on:2011-12-20

Degree:M.Sc.A

Type:Thesis

University:Ecole Polytechnique, Montreal (Canada)

Candidate:Aubertin, Philippe

GTID:2448390002969226

Subject:Engineering

Abstract/Summary:

This master's thesis explores the possibilities offered by Application-Specific Instruction-Set Processors (ASIPs) for digital video applications, specifically for one class of video processing algorithms: local neighbourhood functions. For this algorithm class, an architectural exploration led to the identification of a set of design techniques which, together, form a coherent and systematic approach to designing high-performance ASIPs for real-time video processing.
To demonstrate the validity of the proposed design approach experimentally, seven ASIPs were designed by extending the instruction set of a configurable and extensible processor. Three of the ASIPs implement intra-field deinterlacing algorithms, and four implement 2D convolution with different kernel sizes.
The results show a significant improvement in performance. For the intra-field deinterlacing algorithms, speedup factors range from 95 to 1330 and the Area-Time (AT) product improves by factors of 29 to 243, all relative to a pure software implementation running on a general-purpose processor. For 2D convolution, speedup factors range from 36 to 80 and the AT product improves by factors of 12 to 22. In all cases, real-time processing of high-definition video in the 1080i (deinterlacing) or 1080p (convolution) format is possible given a 130 nm manufacturing process.
The proposed design approach aims to use the available memory bandwidth efficiently, since this bandwidth is the main performance bottleneck of the application. The processing speed limit imposed by this bottleneck can be approached through an appropriate data reuse strategy and by exploiting the data parallelism inherent to the target algorithm class.
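For readers unfamiliar with the algorithm class, the following is a minimal software sketch of two local neighbourhood functions: a 2D convolution and a simple intra-field deinterlacer. It is not taken from the thesis. The function names are hypothetical, and line averaging is only an assumed stand-in, since the abstract does not name the three deinterlacing algorithms that were implemented. Each output pixel depends only on a small window of nearby input pixels, which is what makes data reuse and data parallelism applicable.

```python
def convolve2d(image, kernel):
    """2D convolution with a K x K kernel, computed only where the window fits.

    image and kernel are lists of rows. No padding is applied, so the output
    is (rows - K + 1) x (cols - K + 1).
    """
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    # The kernel is flipped in both axes (true convolution).
                    acc += image[r + i][c + j] * kernel[k - 1 - i][k - 1 - j]
            out_row.append(acc)
        out.append(out_row)
    return out


def deinterlace_line_average(field):
    """Rebuild a full frame from a single field (assumed method: line averaging).

    Every field line is copied, and each missing line is the integer average
    of the field lines above and below it. The last missing line has no line
    below, so it repeats the line above.
    """
    frame = []
    for idx, line in enumerate(field):
        frame.append(list(line))
        if idx + 1 < len(field):
            below = field[idx + 1]
            frame.append([(a + b) // 2 for a, b in zip(line, below)])
        else:
            frame.append(list(line))
    return frame
```

In this naive form, a K x K convolution reads K*K input pixels for every output pixel, which is the memory traffic that the design approach sets out to reduce.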
The design approach comprises four steps. First, a Single Instruction Multiple Data (SIMD) instruction that computes several pixels in parallel is created. Second, shift registers are added to reuse input pixels within a line. Third, a processing pipeline is created by adding application-specific registers. Finally, custom load/store instructions are created. Some of these steps make hardware simplifications possible for some algorithms of the target class. The resulting hardware structure, together with the instruction-level parallelism made possible by a Very Long Instruction Word (VLIW) architecture, mimics a pipelined systolic array.
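The intra-line reuse step can be illustrated with a small software model. This is a sketch under stated assumptions, not the thesis hardware: the function names are hypothetical, each of the K shift registers is modelled as a Python list holding K pixels, and the SIMD, pipelining and custom load/store steps are not modelled. The model counts pixel loads to show how keeping the window in registers cuts memory traffic compared with reloading the whole window for every output.

```python
def convolve_row_with_shift_registers(lines, kernel):
    """Convolve K input lines into one output line using K shift registers.

    lines: K rows of equal width. kernel: K x K list of rows.
    Returns (outputs, loads), where loads counts pixels fetched from memory.
    """
    k = len(kernel)
    width = len(lines[0])
    window = [[0] * k for _ in range(k)]  # one K-pixel shift register per line
    loads = 0
    outputs = []
    for c in range(width):
        for i in range(k):
            window[i].pop(0)               # shift out the oldest pixel
            window[i].append(lines[i][c])  # load exactly one new pixel per line
            loads += 1
        if c >= k - 1:                     # window is full from column K-1 onward
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += window[i][j] * kernel[k - 1 - i][k - 1 - j]
            outputs.append(acc)
    return outputs, loads


def naive_load_count(width, k):
    """Pixel loads for one output line when every window is reloaded in full."""
    return (width - k + 1) * k * k


if __name__ == "__main__":
    rows = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    result, loads = convolve_row_with_shift_registers(rows, ones)
    print(result, loads, naive_load_count(5, 3))
```

With the shift registers, each new output needs only K fresh pixels (one per line) instead of K*K, so the load count per output line falls from (W - K + 1) * K * K to W * K for a line of width W.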

Keywords/Search Tags:

Video, Instruction, Possible

Related items

1	Comprehensive Video-Module Instruction an Alternative for Teaching IUD Insertion to Family Medicine Residents
2	The effect of supplemental video instruction on aviation student performance
3	Study On Application Specific Instruction Set Processor For Video Coding And Its VLSI Implementations
4	Research On Instruction Set And Design Of Data Path For Application Specific Video Processor
5	Instruction-flow Scheduling Mechanism For High-performance SIMD DSP
6	Research And Implementation Of Instruction Decoder Verification And Vectorization Compiling For BWDSP
7	Optimization Of A MPEG-2 Video Decoder Using The Multimedia Instruction Set Of Godson-2
8	The Design And Implement Of Instruction Decode&Control Unit In FT-C55LP
9	Design Of Configurable And Extensible Media Processor
10	Research On Virtual Instruction Translation Technique And Implementation Of Translator