
Research On Parallel Acceleration Of High Definition Video Real-Time Defogging And H.264 Decoding Based On OpenCL

Posted on: 2017-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Wang
GTID: 2428330488979851
Subject: Information and Communication Engineering
Abstract/Summary:
Over the past few decades, multimedia processing has received much attention, and the demand for creating and processing high-definition multimedia content, especially high-definition videos and images, has grown greatly. However, because high-definition multimedia content involves large amounts of data, processing and analyzing it requires high-performance computing power. With the spread of CPU+GPU heterogeneous computing environments and the rise of OpenCL heterogeneous parallel computing, the large-scale parallel computing power of the GPU can be used to improve performance by several orders of magnitude. This thesis therefore studies the parallel acceleration of high-definition video real-time defogging and H.264 decoding based on OpenCL, as follows:

1) At present, research on single-image dehazing focuses mainly on improving the defogging effect and reducing computational complexity, while research on accelerating implementations of defogging algorithms is still very limited. The resolution of high-definition video is much higher than that of ordinary video, so real-time dehazing of high-definition video involves a huge amount of computation and requires faster defogging. We propose a parallel implementation and optimization of real-time high-definition video dehazing using OpenCL, based on a single-image haze removal algorithm (the FHRUSI algorithm). We first implement a serial CPU version of the algorithm and a basic OpenCL parallel program, and then optimize the program according to the characteristics of our embedded SoC hardware platform and of the algorithm itself. Our optimization takes full advantage of the AMD GPU memory hierarchy, reducing memory access latency and increasing the degree of parallelism of the algorithm, which greatly reduces execution time. We ported the optimized OpenCL version into the open-source multimedia framework FFmpeg as an independent module. Experimental results show that we can process 1080p (1920×1080) high-definition hazy video at a real-time rate (more than 41 frames per second) with good haze removal quality. Our optimized implementation achieves a speedup of more than 4.8 times.

2) We propose a parallel implementation and optimization of the H.264 inverse discrete cosine transform (IDCT) using OpenCL, based on the open-source framework FFmpeg. First, we separate the IDCT from the macroblock decoding loop and, according to block size, rewrite it as two OpenCL kernel functions that execute on the GPU. Second, offloading the IDCT computation from the CPU to the GPU introduces additional overhead (memory copies and the OpenCL runtime). We therefore further optimize the OpenCL program, including CPU-GPU communication optimization, local memory optimization, and other optimizations. Experimental results show that the optimized GPU kernel obtains a remarkable speedup over the corresponding SIMD version of the IDCT executing on the CPU. However, once the memory copy and OpenCL runtime overhead are taken into account, the complete IDCT is slower than the CPU SIMD version, so our implementation does not achieve a speedup at the application layer.
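To make the IDCT step in item 2) more concrete, the following is a minimal C sketch of the standard H.264 4×4 inverse integer transform for a single block. It is not the thesis's OpenCL kernel, which the abstract does not show. The names `idct4_1d` and `h264_idct4x4_residual` are hypothetical, and the input coefficients are assumed to be already dequantized and stored in row-major order. Each 4×4 block is transformed independently of every other block, which is why this step can in principle be pulled out of the macroblock loop and assigned to separate GPU work-items.

```c
#include <stdint.h>

/* One 1-D butterfly of the H.264 4x4 inverse integer transform.
 * Assumes >> on negative ints is an arithmetic shift, as on common compilers. */
static void idct4_1d(int a0, int a1, int a2, int a3, int out[4])
{
    const int z0 = a0 + a2;
    const int z1 = a0 - a2;
    const int z2 = (a1 >> 1) - a3;
    const int z3 = a1 + (a3 >> 1);

    out[0] = z0 + z3;
    out[1] = z1 + z2;
    out[2] = z1 - z2;
    out[3] = z0 - z3;
}

/* Hypothetical helper: transforms one row-major 4x4 block of dequantized
 * coefficients into a 4x4 block of residuals (before adding the prediction). */
void h264_idct4x4_residual(const int16_t coeff[16], int16_t residual[16])
{
    int tmp[16];
    int v[4];

    /* Horizontal pass: transform each row. */
    for (int r = 0; r < 4; r++) {
        idct4_1d(coeff[4 * r], coeff[4 * r + 1],
                 coeff[4 * r + 2], coeff[4 * r + 3], v);
        for (int c = 0; c < 4; c++)
            tmp[4 * r + c] = v[c];
    }

    /* Vertical pass: transform each column, then round and scale by 1/64. */
    for (int c = 0; c < 4; c++) {
        idct4_1d(tmp[c], tmp[4 + c], tmp[8 + c], tmp[12 + c], v);
        for (int r = 0; r < 4; r++)
            residual[4 * r + c] = (int16_t)((v[r] + 32) >> 6);
    }
}
```

In a GPU port, one plausible mapping would give each work-item (or small work-group) one such block. As the abstract notes, the cost of copying coefficients to the device and results back can outweigh the kernel speedup.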
Keywords/Search Tags:Haze Removal, OpenCL, High Definition video, Real Time, GPU, H.264, Parallel Programming