Font Size: a A A

Design And Implementation Of Ping-Pong Linear Merge Sorting Algorithm Based On FPGA

Posted on:2022-02-22Degree:MasterType:Thesis
Country:ChinaCandidate:Y FengFull Text:PDF
GTID:2518306329459584Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
In the era of big data and the Internet of Things,data interacts mostly as streams in data centers and a range of embedded devices.Based on FPGAs we can design multistage flow structures with high throughput and low power consumption,so it is suitable for various stream computing tasks that arrive continuously.Sorting is also a key operation widely used in databases,image processing,data compression and other areas with streaming computing characteristics.For the sorting of streaming data,the linear sorting algorithm based on non-strict partial ordering sequence has been proved to have higher performance than other algorithms.As a new FPGA high-performance sorting algorithm,this algorithm expands the mathematical order theory.And combined with the structural characteristics of FPGA to achieve a full-pipeline,high-concurrency design.Although linear sorting algorithms based on non-strict partial ordering sequences have more advantages in speed than other sorting algorithms,and at the same time have good scalability and scalability,they can only sort thousands of 32-bit data on FPGA at most.This kind of data processing scale greatly limits the application scenarios of FPGA in the actual sorting problem,so it is particularly important to design an FPGA sorting algorithm that can sort a larger scale.In order to solve this problem,in this article we propose a new ping-pong linear merge sorting algorithm.It comprises three parts: ping-pong linear sorter,BRAM FIFO and merge tree.In the input stage,the input data is divided into different ordered subsequences and cached in the BRAM FIFO by ping-pong switching of two linear sorters.The merge tree is then constructed layer by layer at a rate of one layer every two cycles,after the construction is completed,n numbers can be output per cycle.To decouple the dependencies between layers,in design of the merge tree we add buffer cells.For n metadata nodes the cache cell size is 2n.It makes our merge algorithm more scalable than the FPGASORT work and other merge tree sorter.Besides that,the entire linear merge sorter is fully-pipelined.Ignoring the constant time for the merge tree construction,the time complexity is O(N),Same as the most advanced linear sorter.we implemented our algorithm on FPGA with the System Verilog language.First we use a finite state machine(FSM)to switch input and output state of the two linear sorters and jump state of the merge tree.in the implementation of the merge tree we design the micro-core,node unit(NU),to encapsulate intra-node comparison,storage and inter-node communication.NUs can be customized according to data width,data parallelism and throughput.All NUs run synchronously each cycle and comprise the entire merge tree through simple communications.Compared with previous work,our ping-pong linear merge sorter has high throughput and good scalability.Meanwhile,it expands the amount of data processed on the FPGA.Our experiments on the Xilinx KCU1500 FPGA accelerator board show that the it can reach up to 4.16 Gbps throughput,which is 29.71 times higher than traditional fast sorting algorithms on CPUs,and1822720 32-bit data processing capacity,which is 227.2 times larger than the most advanced linear sorting algorithm.
Keywords/Search Tags:Sorting algorithm, Field Programmable Logic Gate Array(FPGA), Streaming Data, Parallel Computing
PDF Full Text Request
Related items