
Research and Porting Optimization of an Image High-Performance Computing Library for the ARM Architecture

Posted on: 2023-09-13  Degree: Master  Type: Thesis
Country: China  Candidate: Y X Gu  Full Text: PDF
GTID: 2558306911982309  Subject: Engineering
Abstract/Summary:
Image high-performance computing aims to perform fast and efficient calculations on image pixels. To achieve accurate and efficient pixel computation, Intel released the IPP image high-performance computing library for the x86 architecture. For a long time, x86 chip manufacturers have not only monopolized the high-end server market but have also used their mature software environment to keep consumers dependent on their products. With the rapid development of the ARM architecture, its application field has expanded from mobile terminals to servers. To break the monopoly, Huawei designed the Kunpeng 920 processor based on the ARM architecture, which offers strong capacity for high-performance computing workloads. Because the ARM ecosystem lacks an image high-performance computing library, and in order to improve that software ecosystem, this thesis studies image high-performance computing in light of the characteristics of the ARM architecture and the Kunpeng 920 processor, and completes the porting and optimization of the IPP image library from x86 to the ARM architecture.

First, taking the ARM architecture and its commonly used optimization schemes as the starting point, this thesis proposes a fast quantization scheme based on ARM NEON shift instructions for image scaling, and an erosion and dilation algorithm based on a look-up table for morphology. It then introduces the porting and optimization schemes for four modules of the image acceleration library on the Kunpeng 920 processor: data exchange, threshold, statistics and morphology. Based on the SPEC scores of the Kunpeng 920 and the Intel Xeon Gold 6148, the performance benchmark for this image library is set at 60% of IPP. The main work of this thesis is summarized as follows:

(1) A total of 239 function interfaces of the data exchange module are completed, with an average performance of 89.4% of IPP. The design ideas of image copy, data type conversion and image scaling are introduced in detail. The fast quantization scheme based on ARM NEON shift instructions is successfully applied to the image scaling interfaces. This algorithm uses shift instructions to replace the multiplication and division operations of general-purpose quantization logic, which improves the execution speed of the interfaces.

(2) A total of 185 function interfaces of the threshold module are completed, with an average performance of 107.8% of IPP. The design ideas of the single-threshold and double-threshold functions are introduced in detail, and a negation instruction is used to handle the floating-point special value NaN uniformly.

(3) A total of 72 function interfaces of the statistics module are completed, with an average performance of 69.3% of IPP. The design ideas of the maximum, mean, histogram and other functions are introduced in detail. An image histogram statistics method based on the NEON extended instruction set is proposed and successfully applied.

(4) A total of 186 function interfaces of the morphology module are completed, with an average performance of 47.6% of IPP. The design ideas of erosion and dilation in basic morphology, and of the opening, closing, top-hat, black-hat and gradient operations in advanced morphology, are introduced in detail. The look-up-table-based erosion and dilation algorithm is successfully applied to basic morphology. The algorithm optimizes and accelerates the erosion and dilation interfaces by rotating pointers, building look-up tables, and similar techniques. This module does not meet the performance benchmark, because the data is not contiguous when the left and right borders of the image are padded during the morphological operation, so data loading takes a large share of the time.

The ARM-oriented image library designed in this thesis has an average performance of 78.5% of IPP, which meets the benchmark requirement. The Intel IPP image library has thus been successfully ported and optimized.
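The abstract does not give the details of the fast quantization scheme, so the following is only a minimal sketch of the general idea it names: replacing a division in a quantization step with a multiply and a right shift. The divisor 255, the constants `SHIFT` and `MAGIC_255`, and the function names are assumptions chosen for illustration and are not taken from the thesis. The scalar Python arithmetic stands in for what a NEON shift instruction would do on each vector lane.

```python
# Illustrative only: the constants and names below are assumptions,
# not the thesis's actual scheme.

SHIFT = 23
# Fixed-point reciprocal of 255, rounded up: ceil(2**23 / 255) = 32897.
MAGIC_255 = (1 << SHIFT) // 255 + 1


def quantize_shift(x: int) -> int:
    """Divide a 16-bit unsigned value by 255 using multiply + shift only."""
    return (x * MAGIC_255) >> SHIFT


def rshift_round(x: int, s: int) -> int:
    """Rounding right shift: divide by 2**s, rounding halves up.

    This mirrors the behaviour of a rounding shift-right instruction
    (for example URSHR on AArch64) applied to a non-negative lane value.
    """
    return (x + (1 << (s - 1))) >> s


def check_exact() -> bool:
    """Compare the shift-based result with true integer division."""
    return all(quantize_shift(x) == x // 255 for x in range(1 << 16))
```

For 16-bit inputs the product `x * MAGIC_255` stays below 2^32, so in a vectorized version each lane would fit in a 32-bit unsigned integer. Calling `check_exact()` compares the shift-based result against integer division over the whole 16-bit input range.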
Keywords/Search Tags:Image high performance computing, Kunpeng 920 processor, IPP image library, Transplant optimization, performance evaluation