
The Optimization And Implementation Of X.264 Video Encoder On The Platform Of Graphics Processing Unit

Posted on: 2016-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: D Jiang
GTID: 2428330473464980
Subject: Computer application technology
Abstract/Summary:
Graphics processing units (GPUs) offer very high computational performance at relatively low cost. GPU hardware has been advancing faster than Moore's law, and general-purpose computing on graphics processing units (GPGPU) has made continuous progress. Industry, represented by NVIDIA Corporation, has produced a series of GPUs with strong floating-point capability and a high degree of parallelism. The arrival of programmable GPUs has made general-purpose GPU computing a hot research topic.

H.264/AVC is a video coding standard developed jointly by ISO/IEC and ITU-T. It achieves an excellent compression ratio and good network adaptability. However, it adopts new coding tools such as variable-block-size motion estimation/compensation and multiple intra-prediction modes, which make the encoder computationally complex. Unlike the H.264/AVC verification model, X.264 is an open-source encoder compatible with the H.264/AVC standard; it achieves desirable performance and has relatively high practical value. Motivated by the computational demands of video encoding and the computational capability of GPUs, this thesis studies the optimization and implementation of the X.264 encoder on the GPU platform, exploiting the GPU's floating-point capability and high parallelism.

The main work and contributions are summarized as follows.

First, the structure of the X.264 video encoder is analyzed, in particular its drawbacks in function hierarchy, data-structure hierarchy, and operation types. The potential for parallelizing the X.264 encoder and for porting it to another platform is also investigated. The Intel VTune performance analyzer is used to profile X.264 encoding in practice. From statistics of the test results, the time consumed by the main functions and by each functional module of the encoder is obtained. This lays a solid foundation for selecting the computation-intensive modules with parallel potential in the X.264 encoder, which are then targeted for GPU optimization.

Second, after an analysis of the motion estimation algorithms adopted by X.264, a parallel optimization strategy is proposed for the sum of absolute differences (SAD) computation and the comparison of SAD values among different blocks in the motion estimation module. The proposed parallel processing is implemented on the GPU with Compute Unified Device Architecture (CUDA). Motion estimation is known to be the most computation-intensive step of an H.264/AVC encoder, accounting for more than 60% of the computation time. However, motion estimation in X.264 is block-based and highly parallel, which makes it well suited to GPU implementation. The matching criterion, i.e., the SAD computation and comparison, is implemented on the GPU in single-instruction, multiple-thread (SIMT) fashion, which accelerates it. Experimental results on several typical video sequences such as Foreman show that the proposed GPU implementation improves efficiency by a factor of 6 to 8 by exploiting the parallelism of full-search motion estimation. Moreover, the higher the spatial resolution and the larger the search range, the greater the speedup. Compared with the GPU optimization of the motion estimation module in the H.264/AVC reference software, the acceleration ratio is almost the same, but the time consumption is reduced.
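
To make the matching criterion concrete, the following is a minimal sequential sketch of full-search motion estimation with SAD, written in plain Python for readability. It is not the thesis's CUDA code: the function names `sad` and `full_search`, the frame representation (lists of rows of luma values), and the rule of skipping candidates that fall outside the reference frame are all assumptions made for illustration. The point it shows is that the SAD of each candidate displacement depends on no other candidate, so in a GPU version each iteration of the two outer loops could plausibly be assigned to its own thread, followed by a minimum reduction.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Frames are lists of rows of integers."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement in [-search_range, search_range]^2 and return
    (best_sad, dx, dy), or None if no candidate fits inside `ref`.

    Each candidate is evaluated independently of the others; this is the
    property a parallel implementation would exploit (illustrative
    assumption, not the thesis's kernel layout)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose block would leave the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Strict comparison keeps the first minimum in scan order.
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Calling `full_search(cur, ref, bx, by, 16, 16)` would search a 16x16 macroblock over a window of plus or minus 16 pixels, which costs 33 x 33 SAD evaluations of 256 pixels each per block. This quadratic growth in the search range is consistent with the observation above that larger search ranges benefit more from parallel evaluation.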
Keywords/Search Tags: H.264/AVC, X.264, Graphics Processing Unit (GPU), Compute Unified Device Architecture (CUDA), motion estimation