The Design And Implementation Of A Local Multi-port Computing Acceleration Device Based On FPGA

Posted on: 2022-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: J H Zhu
GTID: 2518306479978529
Subject: Signal and Information Processing
Abstract/Summary:
We are now in the era of the Internet of Everything (IoE). The volume of data keeps growing, and data types are becoming more diverse. As Moore's Law gradually fails, traditional CPU hosts can no longer meet the demand of computing on massive unstructured data. Scaling up the system alone is not enough to solve this problem; different computing architectures are needed. Following the heterogeneous computing approach, this thesis proposes and implements a local multi-port computing acceleration device based on an FPGA.

The device uses the FPGA as its main computing unit. By offloading host computing tasks locally, it reduces CPU usage and improves computing performance. The device has 12 computing ports and uses an arbitration mechanism to provide computing services to multiple hosts. Through its expansion interface, it can be used to build heterogeneous computing systems of different scales.

The hardware system is based on the Xilinx Zynq UltraScale+ heterogeneous processor (FPGA + ARM). The main peripherals are 12 SFP+ connectors for data transmission, 8 DDR4 memory chips for data storage, and 2 QSFP+ connectors for device expansion. The circuit board has 14 layers and carries 1540 components, 1054 nets, and 4872 connections, including 17 power supplies. The highest serial signal rate is 10.3125 Gbps, and the highest parallel bus rate is 2666 Mbps. To address signal integrity (SI) and power integrity (PI) problems, the PCB design applies a planned layer stack-up, a power decoupling network, strict impedance control, routing delay constraints, and topology optimization. Combined with theoretical calculation and simulation verification, these measures ensure a stable power distribution system and correct signal transmission.

In the software system, the low latency and high concurrency of the FPGA are exploited to implement a series of functions: efficient host-device data transmission, bus arbitration among the computing ports, and hardware implementation of computing tasks. When the device is used to offload the computation of the Tiny-YOLO convolutional neural network, it reduces latency by 62.9% and increases throughput by 8.79 times for the host. In summary, the device is flexible and scalable and can locally provide heterogeneous computing capability to existing CPU hosts or servers, so it has practical application value.
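The abstract says the 12 computing ports share the device through an arbitration mechanism but does not name the policy. As a minimal behavioral sketch, assuming a simple round-robin policy (an assumption for illustration, not the thesis design), the grant logic might look like the following. The class name `RoundRobinArbiter` and its methods are hypothetical.

```python
from typing import List, Optional

NUM_PORTS = 12  # the abstract states that the device has 12 computing ports


class RoundRobinArbiter:
    """Hypothetical round-robin arbiter (the policy is assumed, not taken from the thesis)."""

    def __init__(self, num_ports: int = NUM_PORTS) -> None:
        self.num_ports = num_ports
        # Start as if the last port was just served, so port 0 is checked first.
        self.last_grant = num_ports - 1

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the granted port, or None if no port is requesting."""
        if len(requests) != self.num_ports:
            raise ValueError("one request flag per port is required")
        # Search in circular order, starting just after the most recently granted port.
        for offset in range(1, self.num_ports + 1):
            port = (self.last_grant + offset) % self.num_ports
            if requests[port]:
                self.last_grant = port
                return port
        return None


if __name__ == "__main__":
    arbiter = RoundRobinArbiter()
    requests = [False] * NUM_PORTS
    requests[2] = requests[5] = True  # two hosts request service at the same time
    print([arbiter.grant(requests) for _ in range(3)])
```

With ports 2 and 5 requesting continuously, the grants alternate between the two, so neither host can starve the other. In an FPGA this logic would typically be written in an HDL and evaluated every clock cycle; the Python version only models the behavior.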
Keywords/Search Tags: High-speed digital system, Heterogeneous computing, FPGA, Hardware acceleration, Bus arbitration