Font Size: a A A

Fast Computation Research Of Zernike Moments Based On GPU

Posted on:2019-01-02Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y B XuanFull Text:PDF
GTID:1368330548956713Subject:Applied Physics
Abstract/Summary:PDF Full Text Request
Zernike moments are orthogonal coefficients projected on a set of Zernike polynomials by image.It can remove the redundant information of the images,and it is robust to the image noise and other related operations.And also,its amplitude essentially is the rotation,translation and scale invariance because the Zernike polynomial is defined in the complex domain of the unit circle.Therefore,Zernike moments are widely used in image analysis,pattern recognition and optics due to these excellent properties.Due to the complex definition of Zernike moments,the slow speed of calculation,and the calculation precision,the algorithm of the precision for Zernike moments in the projections of the incircle and the outcircle is determined;and pre-storing radial polynomal coefficients to eliminate the limited precision caused by factorial calculation is proposed;approaches to fast computition of Zernike moments are proposed and the difficulties of implementing them on GPU are analyzed.Finally,efficient optimization approach for fast GPU computation of Zernike moments is caried out.The main contributions and novelty of this dissertation are summarized as follows.(1)The order of single and double precision can only be calculated up to 18 and 40 due to the factorial calculation in Zernike moments.Pre-storing radial polynomal coefficients is proposed because it effectively eliminates the constraints of factorial and the calculation of single moment in recursive algorithm.The computation time is lesser than the recursive algorithm and stored factorial algorithm while order of Zernike moments is improved.(2)GPU octant symmetric algorithm and data re-layout scheme that involves reordering the image pixels and pre-addressing the diagonal pixels are proposed.Thread divergences that causes a large number of conditional statements to extract symmetry points are avoided.Maximum memory bandwidth is achieved after the data re-layout.(3)The above proposed pre-stored radial polynomial coefficients combines proposed octant symmetric algorithm.The hybrid algorithm further shortens the computation time and overcomes the limitation that the recursive algorithm cannot find a single moment.Its computational performance is better than the direct method in the large size images on GPU.(4)The scheme packeted thread blocks to one kernel is proposed.The scheme greatly improves the resource utilization and takes back the computational performance lost in small size images when the proposed hybrid algorithm is used.Regarding the methodology of GPU acceleration,there are three contributions:(1)Re-layout scheme can effectively avoide thread divergences caused by a large number of conditional statements to extract the special points if the processing areas are fan-shaped or irregular.Reordering is the key to data re-layout,although it requires additional overhead,but coalesced access to global memory maximizes the performance of GPU in subsequent processing.(2)GPU memory space is used effectively.Data pre-calculated are stored in GPU memory which can later be extracted by searching in the form of lookup tables.The constant memory storing strategy is particularly efficient in minimizing the repeated execution of many highly frequent computations,saving computational time.In short,it is the strategy that memory space trades with computation time.(3)If GPU Stream Multiprocessors are idle,it leads to low occupancy and idle resources.And if there are same executable code and different input and output parameters in an application,using the scheme packeted thread blocks can improve the utilization rate of resources and further reduce computation time.For instance,calculation of any other type of moments and the calculation of Gauss Pyramid in SIFT image matching algorithm conform to our package conditions.The lower occupancy rate,package scheme is more effective.This dissertation uses Nvidia GPU and CUDA(Compute Unified Device Architecture)to accelerate the design and introduces GPU hardware architecture and CUDA programming model.The direct method,the re-storing radial polynomial coefficients,four quadrant symmetry,octant symmetric algorithm and scheme packeted thread blocks are analysised in detail.The conclusions of this research include;GPU can attain a gain of up to hundreds of times faster in the calculation of Zernike moments compared to CPU;proposed that hybrid algorithm has significt speed-up compared to the direct method in large size images(512 and 1024).The speed-up of a minimum 5.5 and a maximum of 8.6 are reported in computing a set of Zernike moments for a given oreder;the speed-up of a minimum 2.4 and a maximum of 13.5 were achieved using the hybrid algorithm through the package scheme.Contrapose the phenomenon that the full resources in the computation for Zernike moments in large scale images,the multi GPU segmentation scheme is proposed and supports hybrid algorithm in multiple GPU.
Keywords/Search Tags:Zernike moments, GPU, octant symmetry, package scheme, radial polynomial
PDF Full Text Request
Related items