Research And Implementation Of H.264 Parallel Encoder Based On CUDA

Posted on:2011-01-06

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Su

Full Text:PDF

GTID:2178330338490135

Subject:Computer Science and Technology

Abstract/Summary:

As the most widely used video coding standard, H.264/AVC has attracted attention from both academia and industry for its high image quality and high compression ratio. However, H.264 encoding is computationally intensive. Existing serial encoders running on general-purpose processors cannot meet the demands of real-time encoding for full HD video, while dedicated hardware encoders are also unsatisfactory because of their inflexibility, long development time, and high cost. An efficient implementation of H.264 encoding is therefore urgently needed. Graphics processing units (GPUs) have advanced rapidly in compute capability and bandwidth, and using GPUs to accelerate applications has become a hot research topic. CUDA and OpenCL have made GPUs more flexible to program. This thesis therefore focuses on the research and implementation of an H.264 parallel encoder based on CUDA.

Building on a streaming H.264 encoder, this thesis proposes a parallel encoder that better fits the characteristics of the CUDA framework. Unlike other GPU-based H.264 encoders, this work does not focus on a single component of the encoder but maps the entire H.264 encoder onto the CUDA architecture. It designs a parallel computing model and a storage model for each module of the CUDA-based H.264 encoder, and optimizes the encoder in several respects.

Finally, the performance of the proposed encoder is evaluated with 1080p video as input. Experimental results show that the encoder achieves a significant speedup over the reference encoder: running on a GeForce GTX260, the speedup is up to 18 times. The encoder also outperforms other GPU-based encoders.

To evaluate how much each component contributes to the whole encoder and to locate the bottlenecks, the speedup of each part of the encoder is also measured. The results show that inter coding achieves the best performance, with a speedup of about 25, while intra coding achieves the worst, with a speedup of about 4. Data transfer between the CPU and the GPU accounts for 25% of the total GPU time and is one of the bottlenecks. The parallel models proposed for each component and the performance analysis offer insights for other GPU-based applications.
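
The figures quoted above can be related with some simple arithmetic. The sketch below is not taken from the thesis: the function names are hypothetical, and the time shares passed to `combined_speedup` are illustrative placeholders, since the abstract does not report how encoding time is split between components. It assumes an Amdahl-style model in which each component's original time is divided by its own speedup.

```python
def gain_if_transfers_hidden(transfer_share: float) -> float:
    """Upper bound on the further speedup of the GPU portion if
    CPU-GPU transfers (a given share of GPU time) cost nothing."""
    if not 0.0 <= transfer_share < 1.0:
        raise ValueError("transfer_share must be in [0, 1)")
    return 1.0 / (1.0 - transfer_share)


def combined_speedup(time_shares, speedups):
    """Amdahl-style overall speedup.

    time_shares: fraction of the original (serial) time spent in each
                 component; must sum to 1.
    speedups:    speedup achieved for each component.
    """
    if len(time_shares) != len(speedups):
        raise ValueError("time_shares and speedups must have equal length")
    if abs(sum(time_shares) - 1.0) > 1e-9:
        raise ValueError("time_shares must sum to 1")
    return 1.0 / sum(share / s for share, s in zip(time_shares, speedups))


if __name__ == "__main__":
    # 25% of GPU time spent on transfers is the figure from the abstract.
    print(gain_if_transfers_hidden(0.25))

    # ASSUMPTION: the 90% / 10% split is a made-up illustration, not a
    # measurement from the thesis. 25 and 4 are the reported inter and
    # intra coding speedups.
    print(combined_speedup([0.9, 0.1], [25.0, 4.0]))
```

Under this model, a component with a small share of the original time but a low speedup (such as intra coding here) can still pull the overall figure well below the best per-component speedup, which is one way to read the gap between 25 and 18.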

Keywords/Search Tags:

H.264 encoder, CUDA, GPU, parallel encoder

Related items

1	Research And Implementation Of H.264 Parallel Encoder Based On Cuda
2	Research And Design Of H.264Vedio Encoder Based On CUDA
3	Parallel Design Of JPEG-LS Encoder Based On CUDA
4	Based On Cpu + Gpu In H. 264 Encoder Parallel Code Design
5	Design And Implementation Of HEVC Parallel Encoder Based On CPU+GPU Platform
6	An Efficient Implementation Of H.264/AVC Encoder Based On GPU
7	The Design And Implementation Of Parallel BCH Encoder And Decoder
8	The Design And Implement Of Parallel MPEG-4 Video Encoder Based On MDSP
9	Research On Key Technologies At The Encoder In Distributed Video Coding
10	Software Design Of Multi-channel H.264 Encoder Based On DM6467 Processor