Research And Implementation Of H.264 Parallel Algorithm Based On GPU

Posted on: 2019-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: S K Wang
GTID: 2348330569995735
Subject: Engineering
Abstract/Summary:
As a high-performance video coding standard, H.264 is currently one of the most popular coding standards because of its high compression ratio. Its complexity, however, is much higher than that of other standards, which makes decoding more time-consuming. With the continuous development of the industry, the GPU has gradually evolved from a device used only for graphics rendering into one used for general data processing, owing to its powerful floating-point capacity and its parallel computing characteristics. Since the release of CUDA in 2007, researchers have been able to develop for and use GPUs more easily, so using the GPU for video decoding has broad application prospects. This thesis adopts concurrent collaboration between the CPU and the GPU to design a heterogeneous cooperative parallel scheme for the entire decoding process, implemented on CUDA. Modules that are well suited to parallel computing are assigned to the GPU, while master-slave dual-thread scheduling is used on the CPU to exploit its multi-thread scheduling characteristics: the child thread handles tasks with weaker parallelism, such as bitstream parsing or reordering, and the main thread is mainly responsible for data and GPU scheduling and for image reconstruction through the GPU. The thesis focuses on five decoding modules: IDCT, inverse quantization (IQ), intra prediction, inter prediction, and loop filtering. It implements parallel algorithms for inverse quantization and inter prediction; proposes a parallelization and optimization of the IDCT module that incorporates the fast butterfly operation; presents a region-based parallelization scheme for intra prediction that takes its data dependencies into account; and explores parallelizing and optimizing the filter-strength computation and the boundary filtering algorithm while preserving the behavior of the loop filter. The parallel algorithms were implemented with the CUDA programming model and tested in a large number of experiments. The experimental data show that, compared with the serial FFmpeg decoder, the parallel decoder designed in this thesis significantly improves H.264 decoding efficiency and greatly reduces CPU utilization.
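
To make the "fast butterfly operation" in the IDCT module concrete, the following is a minimal CPU-side sketch in C of the 4x4 inverse integer transform defined by the H.264 standard, written as a row pass followed by a column pass. It is not the thesis's CUDA kernel: the function names `butterfly4` and `idct4x4` and the flat 16-element, row-major array layout are assumptions made for illustration. Because each 4x4 block is transformed independently, a GPU version could plausibly assign one thread (or a small group of threads) to each block.

```c
#include <assert.h>
#include <stddef.h>

/* One 1-D butterfly of the H.264 4x4 inverse integer transform.
 * Only additions, subtractions and shifts are needed.
 * Note: >> on negative ints is assumed to be an arithmetic shift,
 * which holds on common compilers. */
static void butterfly4(int d0, int d1, int d2, int d3, int out[4])
{
    int e0 = d0 + d2;
    int e1 = d0 - d2;
    int e2 = (d1 >> 1) - d3;
    int e3 = d1 + (d3 >> 1);

    out[0] = e0 + e3;
    out[1] = e1 + e2;
    out[2] = e1 - e2;
    out[3] = e0 - e3;
}

/* Hypothetical helper: transforms one 4x4 block of already dequantized
 * coefficients (row-major, 16 ints) into residual samples. */
void idct4x4(const int coeff[16], int residual[16])
{
    int tmp[16];
    int col[4];

    assert(coeff != NULL && residual != NULL);

    /* Row pass: apply the butterfly to each of the four rows. */
    for (int i = 0; i < 4; ++i) {
        butterfly4(coeff[4 * i], coeff[4 * i + 1],
                   coeff[4 * i + 2], coeff[4 * i + 3], &tmp[4 * i]);
    }

    /* Column pass, followed by rounding and the final shift by 6. */
    for (int j = 0; j < 4; ++j) {
        butterfly4(tmp[j], tmp[4 + j], tmp[8 + j], tmp[12 + j], col);
        for (int i = 0; i < 4; ++i) {
            residual[4 * i + j] = (col[i] + 32) >> 6;
        }
    }
}
```

With a block whose only nonzero coefficient is a DC value of 64, every residual sample produced by this sketch is 1, which is a convenient sanity check when porting the routine to a GPU kernel.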
Keywords/Search Tags:GPU, CUDA, H.264 parallel computing, intra prediction