FFT Implementation And Optimization On ARM V8 Platform

Posted on: 2019-10-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Chen
Full Text: PDF
GTID: 2428330545977174
Subject: Computer software and theory
Abstract/Summary:
ARM V8 is the first ARM processor architecture to support a 64-bit instruction set. Its computing power has improved greatly, and its range of applications has widened. The Fast Fourier Transform (FFT) is a fast algorithm for computing the Discrete Fourier Transform (DFT) or its inverse, and it is widely used in engineering, science, and mathematics. So far, there have been few implementations and optimizations of high-performance FFT algorithms for the ARM platform. As ARM V8 processors are deployed in more applications, a high-performance FFT implementation on the ARM platform becomes increasingly important. This thesis implements and optimizes PerfFFT, a high-performance two-dimensional FFT library for the ARM V8 platform. The library is optimized through FFT butterfly network optimization, butterfly optimization, automatic butterfly generation, SIMD optimization, assembly optimization, memory alignment, a cache-aware blocking algorithm, efficient transposition, and other methods. Together, these approaches greatly improve FFT performance. The results of our numerical experiments show that PerfFFT achieves a 10% to 591% performance improvement over the most widely used open-source FFT library (FFTW 3.3.6) and a 13% to 44% performance improvement over ARM's high-performance commercial library (ARM Performance Library).
Keywords/Search Tags: ARM V8, FFT algorithm, FFTW, ARMPL, SIMD optimization, cache use, matrix block
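
To illustrate how a two-dimensional FFT can combine butterflies, blocking, and transposition as the abstract describes, here is a minimal sketch in plain Python. It is not the PerfFFT implementation, which the abstract describes only at a high level; it assumes power-of-two dimensions, and the function names (`fft_radix2`, `transpose_blocked`, `fft2d`) and the tile size are hypothetical choices made for illustration. The sketch uses the standard row-column decomposition: FFT every row, transpose tile by tile, FFT every row again, and transpose back.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    half = n // 2
    for k in range(half):
        # One butterfly: combine the even and odd sub-results using twiddle factor w.
        w = cmath.exp(-2j * cmath.pi * k / n)
        t = w * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def transpose_blocked(a, block=4):
    """Transpose a rows x cols matrix tile by tile (the tile size is illustrative)."""
    rows, cols = len(a), len(a[0])
    out = [[0j] * rows for _ in range(cols)]
    for i0 in range(0, rows, block):
        for j0 in range(0, cols, block):
            # Copy one tile; in a compiled implementation the tile would be
            # sized so that its source and destination both fit in cache.
            for i in range(i0, min(i0 + block, rows)):
                for j in range(j0, min(j0 + block, cols)):
                    out[j][i] = a[i][j]
    return out


def fft2d(a, block=4):
    """2D FFT by row-column decomposition: row FFTs, transpose, row FFTs, transpose."""
    step1 = [fft_radix2(row) for row in a]
    step2 = [fft_radix2(row) for row in transpose_blocked(step1, block)]
    return transpose_blocked(step2, block)
```

Transposing between the two passes lets both passes read contiguous rows, which is the usual motivation for pairing a 2D FFT with an efficient transpose. In pure Python the tiling has no real cache benefit; it only shows the loop structure that a C or assembly version would use.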