Halftone Algorithm For Heterogeneous Many-core Processors

Posted on: 2017-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Xiang
GTID: 2348330485986725
Subject: Computer application technology
Abstract:
Halftoning is a reprographic technique widely used in image processing. Because of their high computational and memory-access cost, serial halftone algorithms cannot meet real-time processing requirements as image sizes grow. However, little research has addressed accelerating halftone algorithms with parallel computing techniques. The work in this thesis is divided into two parts.

The first part: since image convolution is a time-consuming step in image processing, we study its acceleration in depth on many-core processors. We apply shared-memory and register-tiling optimizations to accelerate horizontal 1-dimensional convolution, vertical 1-dimensional convolution, and 2-dimensional convolution. The experimental results show that the effect of each acceleration strategy depends on the architecture of the many-core processor.

The second part: we study the implementation and optimization of a parallel halftone algorithm on heterogeneous many-core processors. First, we adopt a local neighborhood to significantly reduce the computational complexity of the serial halftone algorithm. Then, we adopt a Poisson disk to eliminate the data dependency, so that the algorithm can be parallelized efficiently. Finally, we apply a series of optimizations to fully exploit the performance of the GPU: shared-memory optimization, task-granularity refinement, optimized reduction operations, data broadcast based on constant memory, and multi-dimensional caching based on texture memory.

The experimental results show that, on a heterogeneous platform with an Intel Xeon CPU and a Tesla K20 GPU, the optimized parallel halftone algorithm achieves a 5× to 18× improvement over the naive parallel halftone algorithm and a 95× to 110× speedup over the serial halftone algorithm. On the Tegra K1 platform for mobile devices, the optimized algorithm achieves a 28× to 32× speedup over the serial halftone algorithm. On the Tegra X1 platform, it achieves a 50× to 61× speedup over the serial halftone algorithm.
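
The three convolution variants named in the first part can be illustrated with a minimal serial reference in Python. This is only a sketch for orientation, not the thesis's implementation: the function names, the zero-padding border policy, and the small test image are assumptions made here, and the abstract does not state which kernels or border handling were used.

```python
# Serial reference for horizontal 1-D, vertical 1-D, and 2-D convolution.
# Assumptions (not from the thesis): zero padding at the borders,
# odd-length kernels, images stored as lists of rows.

def convolve_rows(image, kernel):
    """Horizontal 1-D convolution: each output pixel reads its own row."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xx = x + radius - k  # flipped index: true convolution
                if 0 <= xx < width:
                    acc += weight * image[y][xx]
            out[y][x] = acc
    return out


def convolve_cols(image, kernel):
    """Vertical 1-D convolution: each output pixel reads its own column."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                yy = y + radius - k
                if 0 <= yy < height:
                    acc += weight * image[yy][x]
            out[y][x] = acc
    return out


def convolve_2d(image, kernel):
    """2-D convolution with a full (rows x cols) kernel."""
    ry, rx = len(kernel) // 2, len(kernel[0]) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for i, kernel_row in enumerate(kernel):
                yy = y + ry - i
                if not 0 <= yy < height:
                    continue
                for j, weight in enumerate(kernel_row):
                    xx = x + rx - j
                    if 0 <= xx < width:
                        acc += weight * image[yy][xx]
            out[y][x] = acc
    return out


# Hypothetical 3x3 test image and a small smoothing kernel.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel_1d = [1.0, 2.0, 1.0]
# Outer product of the 1-D kernel with itself gives a separable 2-D kernel.
kernel_2d = [[a * b for b in kernel_1d] for a in kernel_1d]

rows_then_cols = convolve_cols(convolve_rows(image, kernel_1d), kernel_1d)
direct_2d = convolve_2d(image, kernel_2d)
```

In all three functions every output pixel is computed only from input pixels, so the iterations of the outer two loops are independent of one another. That independence is what lets a convolution be mapped onto one GPU thread per output pixel. For the separable kernel chosen here, `rows_then_cols` and `direct_2d` hold the same values, which serves as a quick sanity check on the three routines.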
Keywords: image halftone, many-core processors, image convolution, heterogeneous computing