
The Implementation Of Image Post-processing Algorithm Based On GPU Platform

Posted on: 2018-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Jiang
Full Text: PDF
GTID: 2428330596489211
Subject: Major in Electronic and Communication Engineering
Abstract/Summary:
As a common image post-processing technique, frame rate up-conversion (FRUC) increases the frame rate of a video by generating intermediate frames between the original frames, and it can be applied in many fields. 3-D Recursive Search (3DRS) is an effective block-based algorithm that exploits the spatial and temporal correlation between neighboring blocks to improve performance and reduce computational complexity. However, block-based algorithms blur motion edges and introduce blocking artifacts during interpolation. Many researchers have therefore proposed FRUC algorithms based on optical flow. Optical flow methods overcome these problems but incur a heavy computational load. The development of general-purpose GPU computing now provides a new parallel approach to accelerating image processing algorithms. In this paper, we propose a new parallel-friendly frame rate up-conversion algorithm and study its implementation and optimization on CUDA.

First, we propose a new parallel-friendly FRUC algorithm that is suited to implementation on a GPU. The algorithm consists of three steps: Motion Estimation (ME), Motion Vector Post-Processing (MVPP), and Motion Compensated Interpolation (MCI). ME is based on Patch Match, which makes it parallelizable, and Patch Match is adapted to the FRUC application. Self-similarity patches are used to improve the accuracy of ME. In the MVPP step, we propose a method that uses self-similarity patch information and the consistency of the Motion Vector Fields (MVF) to handle the occlusion problem. In the MCI step, we transform the forward and backward MVFs into a bidirectional MVF for interpolation. In addition, we propose a hierarchical motion estimation method to reduce the computational load on high-resolution video.

Second, this paper studies the implementation and optimization of the proposed algorithm on CUDA. For every step of the algorithm, we describe a reasonable thread configuration and an efficient use of memory that together make full use of the hardware resources. Within individual steps, efficient parallel algorithms such as reduction and odd-even sort are applied to improve performance. The CUDA implementation achieves up to a 46-fold speedup over its CPU implementation.

In the experiments, we compare the interpolation quality of the proposed algorithm with that of block-based and optical-flow-based algorithms on low-resolution video. The results show that the proposed algorithm produces better results than the block-based algorithm and comes close to the optical flow algorithm. We also compare the running performance of the different algorithms on high-resolution video. The proposed algorithm has a clear performance advantage, especially over the optical flow algorithm. The hierarchical motion estimation version greatly reduces the computational load with only a small loss in quality compared with the original proposed algorithm, achieving a balance between speed and quality.
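
The abstract names reduction and odd-even sort as parallel building blocks but does not describe them. The following is a minimal sketch of the two patterns, written as sequential Python rather than CUDA so that the step structure is easy to follow. The function names and the choice of a minimum reduction are illustrative assumptions, not details taken from the thesis.

```python
def tree_reduce_min(values):
    """Pairwise (tree) reduction to the minimum value.

    Each pass of the outer loop stands for one parallel step. On a GPU,
    every iteration of the inner loop would be done by a separate thread,
    so n values are reduced in about log2(n) steps.
    """
    if not values:
        raise ValueError("cannot reduce an empty sequence")
    data = list(values)
    n = len(data)
    stride = 1
    while stride < n:
        # Combine element i with its partner `stride` positions away.
        for i in range(0, n - stride, 2 * stride):
            data[i] = min(data[i], data[i + stride])
        stride *= 2
    return data[0]


def odd_even_sort(values):
    """Odd-even transposition sort.

    Phases alternate between comparing pairs that start at even indices
    and pairs that start at odd indices. The compare-and-swap operations
    inside one phase touch disjoint pairs, so they could run in parallel.
    n phases are enough to sort n elements.
    """
    data = list(values)
    n = len(data)
    for phase in range(n):
        start = phase % 2  # 0: even phase, 1: odd phase
        for i in range(start, n - 1, 2):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


if __name__ == "__main__":
    sample = [7, 3, 9, 1, 4]  # made-up values for illustration
    print(tree_reduce_min(sample))
    print(odd_even_sort(sample))
```

Both patterns suit a GPU because the work inside each step is independent: a reduction could, for instance, select the lowest matching cost among candidates, and a small sort could feed a median-style filter. Where exactly the thesis applies them is not stated in the abstract.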
Keywords/Search Tags: Parallel-friendly frame rate up-conversion algorithm, parallel true motion estimation, CUDA, GPU