Design Of H.264 Parallel Encoding Algorithms And Implementation On GPU

Posted on: 2012-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2218330368988233
Subject: Circuits and Systems
Abstract/Summary:
Video is the most abundant information carrier. Because of the huge amount of data it carries, video compression has long been an active research topic. As the most popular video coding standard, H.264 achieves a very high compression ratio; however, this ratio comes at the cost of high encoding complexity and long encoding time. The heavy computational load is a major factor limiting encoding speed.

Because the GPU (Graphics Processing Unit) has strong floating-point computational capability, researchers have in recent years increasingly used it for general-purpose computation to assist the CPU. NVIDIA released a powerful GPU computing architecture called Compute Unified Device Architecture (CUDA) in 2007, which makes parallel programming much easier. The GPU therefore has broad application prospects in the video compression field.

This thesis proposes a parallel encoder based on CPU+GPU that adopts a two-thread design. The main thread handles the CPU side: reading and writing files, transferring data between host and device, and controlling the GPU. The sub-thread handles the GPU side and is responsible for intra prediction, inter prediction, and entropy encoding. The whole encoding process is implemented on the GPU, which not only makes full use of the GPU's computational resources but also frees the CPU from the heavy computation.

Effective parallel algorithms are proposed for the most time-consuming modules of the H.264 encoder. For intra prediction, a fixed trapezoid parallel algorithm is proposed. For inter prediction, the correlation in inter prediction is first analysed; the inter encoding order is then changed and the MVP (motion vector predictor) is preset to zero. Two macroblock-level parallel algorithms, full search MC and three-step search MC, are used for inter prediction, and both include sub-pixel MC. A parallel scheme is also proposed for entropy coding.
By analysing the control correlation, context correlation, and storage correlation, we derive decorrelation methods and present a parallel architecture for entropy coding that consists of three stages: information statistics, code stream generation, and code stream combination. The proposed entropy coding scheme is general: it is not confined to GPU implementations and can also be applied to multi-core processors, clusters, and similar platforms. It provides an effective solution for entropy coding, which is difficult to parallelize.

The experimental results show that the proposed CPU+GPU parallel encoder makes full use of the GPU's computational resources and significantly reduces encoding time. Overall, the parallel encoder is about 4 to 6 times faster than x264 optimized with multimedia instruction sets, and about 35 to 71 times faster than x264 without such optimization.
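
The three-stage entropy coding architecture named above (information statistics, code stream generation, code stream combination) can be illustrated with a small sketch. This is not the thesis's implementation: it assumes each block's symbols can be coded independently, uses unsigned Exp-Golomb codes as a stand-in for the real H.264 entropy coder, and leaves the context-dependent parts unmodelled. The function names (`block_bit_length`, `encode_block`, `combine`) are hypothetical. The idea shown is that once every block's bit length is known, an exclusive prefix sum gives each block its write offset, so the blocks can be written into the output stream independently.

```python
from itertools import accumulate


def exp_golomb(value: int) -> str:
    """Unsigned Exp-Golomb code of a non-negative integer, as a bit string."""
    code = bin(value + 1)[2:]              # binary representation of value + 1
    return "0" * (len(code) - 1) + code    # prefix of leading zeros, then the code


def block_bit_length(symbols) -> int:
    """Stage 1 (statistics): bit count of a block without producing any bits."""
    return sum(2 * (s + 1).bit_length() - 1 for s in symbols)


def encode_block(symbols) -> str:
    """Stage 2 (generation): code one block; no other block is consulted."""
    return "".join(exp_golomb(s) for s in symbols)


def combine(blocks):
    """Stage 3 (combination): place every block at its precomputed bit offset."""
    lengths = [block_bit_length(b) for b in blocks]
    ends = list(accumulate(lengths))                      # inclusive prefix sum
    offsets = [e - n for e, n in zip(ends, lengths)]      # exclusive prefix sum
    stream = ["0"] * sum(lengths)
    # Each iteration touches a disjoint slice of the stream, so on a parallel
    # device every block could be handled by its own thread.
    for symbols, off in zip(blocks, offsets):
        bits = encode_block(symbols)
        stream[off:off + len(bits)] = bits
    return "".join(stream), offsets


if __name__ == "__main__":
    blocks = [[0, 1], [3], [2, 0]]   # hypothetical per-block symbol lists
    stream, offsets = combine(blocks)
    print(offsets, stream)
```

In this sketch the loop runs sequentially; the point is only that the offsets remove the ordering dependency between blocks, which is what makes the combination step suitable for a GPU, a multi-core CPU, or a cluster.
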
Keywords/Search Tags:H.264, GPU, CUDA, encoding, Parallel algorithms