Font Size: a A A

Implementation And Optimization Of Mixed-radix FFT On Sunway Platform

Posted on:2021-01-28Degree:MasterType:Thesis
Country:ChinaCandidate:Y C LuoFull Text:PDF
GTID:2428330602973816Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
Fast Fourier transform(FFT)is one of the basic algorithms in the field of digital signal processing,which has a wide range of applications in scientific computing,image processing and so on.Sunway series processors are domestic processors independently researched and developed in China,which have completely independent intellectual property rights.It plays an important role in the fields of scientific computing and information security in China.Sunway 26010,a processor with the Sunway 260 core,is the basic computing device of "Sunway Taihu light" supercomputer system.Sunway Taihu light has won the top 500 for four consecutive times since it was released in June 2016.It is currently ranked third in the latest list of global supercomputer 500.In the field of Sunway multi-core processor,Sunway1-core,2-core and 4-core processors also carried out the applicability research in the embedded field and achieved some application results.However,at present,there is no customized and efficient FFT algorithm for Sunway embedded processor.The current FFT algorithm can not take the advantages of Sunway processor in vector and cache,and the performance of the algorithm has a large space for improvement.Therefore,it is of great significance to implement efficient FFT algorithm on Sunway embedded platform.To solve this problem,this thesis focuses on the implementation of efficient FFT algorithm on Sunway embedded processor.In this thesis,the radix-4-radix-2 mixed radix FFT algorithm is implemented and optimized from three parts: butterfly computing unit,rotation factor table and bit order transformation.Finally,combined with the architecture characteristics of Sunway 221 processor,the FFT algorithm is optimized for Sunway platform by using loop expansion optimization,manual SIMD vectorization optimization and manual data prefetching optimization.Finally,the FFT algorithm is improved in Sunway 221 processor Single core performance on.The main work of this thesis is as follows:(1)This thesis studies and analyzes several common FFT algorithms,includingradix-2 FFT algorithm,radix-4 FFT algorithm and mixed Radix FFT algorithm,etc,and analyzes their advantages and disadvantages.Based on the platform of Sunway,the algorithm of radix-4-radix-2 mixed Radix FFT is implemented,and the FFT algorithm is improved from three aspects: simplifying the butterfly computing unit,rearranging the rotation factor table and using the look-up table to optimize the bit order transformation operation.(2)Combined with the architecture features of Sunway 221 processor,the mixed radix FFT algorithm is optimized for Sunway platform.The methods of loop expansion optimization,SIMD vectorization optimization and data prefetching optimization are proposed to improve the FFT calculation efficiency of Sunway 221 processor.The experimental results show that the average speedup ratio of the optimized algorithm is 3.01.Under the condition that the error between the FFT algorithm on Sunway and TMS320C6678 is less than 1/10000,the Sunway single core performance of the FFT algorithm is 70% of that of TMS320C6678.It shows that Sunway 221 embedded processor can replace TMS320C6678 processor in some fields.
Keywords/Search Tags:Sunway processor, FFT, look-up table method, loop expansion, SIMD vectorization, data prefetching
PDF Full Text Request
Related items