Research on Key Techniques of H.264 Video Compression Coding on the Stream Architecture

Posted on: 2010-01-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Y Li
GTID: 1118360305982693
Subject: Computer Science and Technology
Abstract/Summary:
Stream architecture is a novel architecture first defined by Professor William J. Dally's research group at Stanford University. The stream processor is a good choice for customized accelerators in high-productivity computing systems, owing to its powerful computing capability, low power consumption and flexible programmability. Stream architecture exploits the arithmetic intensity, abundant parallelism, multi-level data reference locality and predictable memory access patterns that are common in media applications to achieve efficient implementation and performance acceleration.

Digital video is a broad class of computation-intensive applications in the media processing domain, widely used in video conferencing, high-definition television, satellite broadcasting and so on. As a key video processing technology, video compression involves a large amount of complicated computation. The performance of video compression therefore has a crucial influence on many media applications, including Internet video services, portable video devices and high-definition television. Current video compression technology usually consists of complex mathematical transformations and signal processing operations, so real-time video coding requires high performance and sustained throughput. State-of-the-art general-purpose processors, especially those with x86-like architectures, cannot satisfy the requirements of high-performance video coding applications. Real-time video capture and compression therefore rely on special-purpose accelerator components. However, these accelerators lack the flexibility to deal with various video processing applications.

Video compression coding exhibits the program characteristics typical of stream processing, so stream architecture is a good candidate for high-performance video applications.
With increasing image resolution and growing demands on compression ratio and real-time capability, video coding algorithms are becoming more and more complex. How should a video codec be mapped onto the stream architecture? How can the stream program be implemented efficiently for encoding and decoding? How should video data be collected, transmitted, processed and stored as streams? Solving these problems is the key to a streaming implementation of a video encoder and decoder on the stream architecture.

For the H.264 video coding standard, which has high computational complexity, this paper maps the encoding process onto a stream processor, where the multiple modes and adaptivity of the coding algorithm pose significant challenges for the stream computing model. This paper studies methods for porting a high-definition H.264 video encoder to the stream architecture and streamizing it, especially the key techniques in the algorithms, design and implementation of the streaming scheme. Performance is then evaluated from the experimental results, and some suggestions are made for coding optimization. In addition, this paper discusses a stream-based multi-standard transform coder that benefits from the programmability and scalability of stream processors. The main contributions of this paper are summarized as follows:

(i) A streaming framework is proposed for the H.264 video encoder based on the stream computing model. Streamization, that is, mapping onto the stream architecture, is a process that uses the stream-kernel programming model and achieves performance acceleration through the separation of computation and memory access, parallel execution of computation and sequential memory access. According to the various program characteristics exhibited in H.264 encoding, the regular and irregular properties of different modules are described in the proposed streaming framework.
The granularity of parallelism in our framework is the block or macroblock, which avoids pixel-level dependences and improves the efficiency of parallel execution. Our framework supports real-time encoding of high-definition video in the 720p format. The results show that the stream processor can provide high performance for complex video compression coding because it supports a large number of arithmetic units, a unique memory hierarchy, data parallelism and locality.

(ii) A series of stream algorithms is proposed for H.264 video compression coding mapped onto stream processors. Stream algorithms are designed for applications that process large streams of data with little global data reuse and unidirectional data transfer, before they are implemented on the stream computing model. Using the Imagine stream processor and a graphics processing unit (GPU) as platforms, several stream algorithms are proposed, including Interleaved Streaming Transform (IST), Strip-mined Multi-group Intra Prediction (SmMgIP), Recursive Indexed-loading Transform (RIT) and Row-wise Zonal-loading Transform (RZT). The first three algorithms are implemented on Imagine, while the fourth is implemented on the GPU. These stream algorithms exploit program characteristics to obtain efficient streaming implementations, and they are the foundation of our H.264 encoding streaming framework. Additionally, the design ideas behind the stream algorithms adapt well to other data-parallel SIMD computing platforms.

(iii) Irregular stream computing, memory and control models are proposed, and general optimization approaches are given for kernel-level and stream-level programs respectively. The irregular stream models originate from inter prediction and the deblocking filter in the H.264 encoder.
As a supplement to the typical stream programming model, the irregular stream models provide an abstraction and a solution for complex stream applications.

Based on loop optimization, two kernel optimization techniques are proposed: kernel fusion and kernel fission. Kernel fusion increases computational intensity, improves ALU utilization and reduces kernel execution time, so that more computation is done per unit time (for example, in the integer transform). Kernel fission is used for control-intensive kernels and relieves the pressure caused by many branch operations (for example, in the deblocking filter).

For stream-level programs, when the input stream is longer than the capacity of the global stream register file, the spilled part of the stream is written back to off-chip memory. The overhead of stream spilling is high compared with kernel computation. A stream sectioning method is therefore proposed for overly long streams. It divides a long stream into several substreams and adds an outer loop over these substreams. Stream sectioning avoids stream spilling when computing on large data sets and allows computation and memory operations to execute in parallel, which improves the computing efficiency of the stream processor.

(iv) A streaming algorithmic design for multi-standard transform coding is proposed. Current video encoders for popular coding standards share a similar coding structure but differ in some compression tools. In chip design, hardware resources are expected to be shared across multiple standards in order to reduce cost. The block-based discrete cosine transform, integer cosine transform and Walsh-Hadamard transform, and the frame-based discrete wavelet transform, are mapped onto Imagine and the graphics processing unit.
Experimental results indicate that stream processing can offer an efficient solution for multi-standard transform coding algorithms, and achieve higher performance than other programmable computing platforms.
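
The stream sectioning idea from contribution (iii), dividing a long stream into substreams and adding an outer loop over them, can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the names `SRF_CAPACITY`, `section_stream`, `run_sectioned` and `scale_kernel` are hypothetical, the capacity value is arbitrary, and the sketch models only the sectioning itself, not the overlap of computation and memory operations that a real stream processor would provide.

```python
from typing import Callable, Iterator, List, Sequence

# Hypothetical capacity of the stream register file, in elements.
# The real value depends on the target stream processor.
SRF_CAPACITY = 8


def section_stream(stream: Sequence[int], capacity: int) -> Iterator[Sequence[int]]:
    """Yield consecutive substreams, each at most `capacity` elements long."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    for start in range(0, len(stream), capacity):
        yield stream[start:start + capacity]


def run_sectioned(
    stream: Sequence[int],
    kernel: Callable[[Sequence[int]], List[int]],
    capacity: int = SRF_CAPACITY,
) -> List[int]:
    """Apply `kernel` to each substream in an outer loop and join the results."""
    output: List[int] = []
    for substream in section_stream(stream, capacity):
        # Each kernel call sees only a substream that fits in the
        # (hypothetical) register file, so nothing would need to spill.
        output.extend(kernel(substream))
    return output


def scale_kernel(substream: Sequence[int]) -> List[int]:
    """Hypothetical element-wise kernel standing in for real computation."""
    return [2 * x for x in substream]
```

For an element-wise kernel such as `scale_kernel`, `run_sectioned(data, scale_kernel)` gives the same result as applying the kernel to the whole stream at once, while no single kernel call receives more than `SRF_CAPACITY` elements. Kernels with dependences across substream boundaries would need extra handling that this sketch omits.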
Keywords/Search Tags: Stream Architecture, Imagine, Stream Processing, Stream Application, H.264, Video Compression, Video Coding, Kernel Optimization