
Research And Design Of H.264 Video Encoder Based On CUDA

Posted on: 2013-05-30 | Degree: Master | Type: Thesis
Country: China | Candidate: J Zhang | Full Text: PDF
GTID: 2298330422479888 | Subject: Communication and Information System
Abstract/Summary:
H.264/AVC is currently the most advanced video compression standard. By adopting new coding techniques, H.264/AVC achieves higher coding efficiency and image quality. These techniques, however, also bring a high computational cost and large system memory bandwidth requirements. Because of this computational complexity, real-time encoding is very difficult to achieve on current hardware platforms. Meanwhile, in recent years the processing speed of the graphics processing unit (GPU) has been improving faster than Moore's law would predict, and the GPU can accelerate some non-graphics applications. Programming models such as CUDA and OpenCL make it much simpler to develop GPU-based application software. At present, GPUs are widely used in astronomy, fluid mechanics, electromagnetic simulation, signal processing, video compression and many other fields, and have achieved notable results.

Based on a study of the CUDA programming model and the H.264 video encoding framework, this paper proposes a parallel H.264 encoder implementation scheme on CUDA. In this scheme, the CPU is responsible for initializing the coding parameters, reading and writing the video stream file, transferring data between the CPU and the GPU, and scheduling and controlling the GPU. The GPU is responsible for the computing tasks of the coding process, such as inter-prediction coding, intra-prediction coding, integer transform and quantization, entropy coding and deblocking filtering. Through this task allocation, the scheme takes full advantage of both processors.

We then design a parallel implementation for each module of the encoder.
For inter prediction, we have designed integer-pixel motion estimation and fractional-pixel motion estimation, and proposed a new parallel implementation process for fractional-pixel interpolation; for intra prediction, a two-stage parallel implementation process is proposed; for transform and quantization, we have designed parallel implementation processes for the fast DCT and the Hadamard transform; for entropy coding, we have designed parallel implementation processes for the coding of each sub-block and for combining the encoded macroblock streams; for deblocking filtering, we have designed parallel implementation processes for the boundary strength computation and the filtering computation, and proposed a new macroblock execution order for deblocking filtering that increases the parallel granularity.

Finally, this paper uses video sequences in a variety of formats to evaluate the performance of the H.264 encoder. Compared with the traditional serial encoder, the CUDA-based parallel encoder designed in this paper achieves a 16x speed-up when processing 1080p video, with only a small loss of image quality. Overall, the CUDA-based H.264 video encoder developed in this paper delivers a substantial performance improvement; it can be applied in many practical scenarios and can also serve as a reference for other general-purpose computing tasks on the GPU.
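To illustrate why a module such as integer-pixel motion estimation lends itself to this kind of parallel design, the following is a minimal sketch of a full-search block match using the sum of absolute differences (SAD). It is not the thesis's implementation: the function names `sad` and `full_search`, the list-of-rows frame layout, the SAD cost and the exhaustive search pattern are all assumptions made for illustration, and the code is plain serial Python rather than CUDA. The point is only that each candidate displacement is scored independently, so on a GPU each one could in principle be given to its own thread, followed by a minimum reduction.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy).
    Frames are lists of rows of integer luma samples (an assumed layout)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Score every in-bounds integer displacement within +/- search_range
    and return the best one as (sad, dx, dy)."""
    height, width = len(ref), len(ref[0])
    candidates = []
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if inside:
                # Independent work item: no candidate reads another's result.
                candidates.append((sad(cur, ref, bx, by, dx, dy, n), dx, dy))
    # Reduction step: smallest SAD wins; ties fall to the smallest (dx, dy).
    return min(candidates)
```

Calling `full_search(cur, ref, 2, 2, 4, 2)` scores the 4x4 block at column 2, row 2 of `cur` against every displacement of up to two pixels in `ref`. The two nested loops over `dy` and `dx` are the part a data-parallel implementation would distribute; the final `min` corresponds to a parallel reduction.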
Keywords/Search Tags: H.264 encoder, GPU, CUDA, speed-up ratio, parallel granularity, general-purpose computing