
Design And Implementation Of A Reconfigurable Convolutional Neural Network Accelerator Based On FPGA

Posted on: 2022-03-19    Degree: Master    Type: Thesis
Country: China    Candidate: T S Chen    Full Text: PDF
GTID: 2518306539968619    Subject: Circuits and Systems
Abstract/Summary:
In recent years, the Convolutional Neural Network (CNN) has been widely used in many fields such as computer vision, speech recognition, and document analysis. Although CNNs have shown excellent performance in many application scenarios, this comes at the cost of high computational complexity. In many application scenarios, CNNs need to perform forward inference on embedded platforms for reasons such as real-time operation and data security. Because many embedded platforms impose strict power, compute, and memory constraints, efficient processing of forward inference is essential. The Field Programmable Gate Array (FPGA), as a kind of semi-customized hardware circuit, offers flexible design and a high performance-to-power ratio, and has gradually become a research hotspot for CNN hardware acceleration. An accelerator designed for a specific CNN can readily achieve the full computational throughput of the FPGA; however, such an accelerator can only run the network it was designed for, or delivers low performance on other networks. It is therefore of great significance to design a reconfigurable CNN accelerator. Focusing on the reconfigurable hardware architecture, performance, power consumption, and other aspects, this thesis presents the design and implementation of a reconfigurable CNN accelerator based on FPGA. The main work of this thesis is as follows:

1) The convolutional computing characteristics of CNNs are analyzed, and a reconfigurable computing cluster and a reconfigurable on-chip buffer are designed. The reconfigurable computing cluster supports functions such as convolution and nonlinear activation. The reconfigurable on-chip buffer performs zero padding of the input feature map, handles the overlap between feature-map tiles, and transfers data in a prescribed order. The five-stage pipeline structure of the reconfigurable computing cluster fully reuses the DSP resources of the FPGA, which effectively improves the computing power of the accelerator. The reconfigurable on-chip buffer makes full use of the data transmission characteristics of DMA (Direct Memory Access) to improve the efficiency of data transfers.

2) Based on the designed hardware accelerator, a calculation method for finding the optimal feature-map tiling parameters is proposed. This method evaluates the computational performance and data transmission bandwidth of the accelerator for convolutional layers of different sizes and finds the optimal feature-map tiling parameters, thereby achieving the best performance of the accelerator (an illustrative sketch of such a search follows this abstract).

3) Three CNNs, VGG16, ResNet50, and YOLOv2-tiny, are selected as test benchmarks, and the networks are quantized to 16-bit fixed point without fine-tuning (an illustrative quantization sketch also follows this abstract). The quantization errors of the Top-1 and Top-5 accuracy of VGG16 and ResNet50 are both less than 3%; the quantization error of the mean average precision of YOLOv2-tiny is less than 3%, and its recall is reduced by less than 1%.

4) The FPGA-based reconfigurable accelerator is verified on the Xilinx Zynq ZC706 evaluation board. At a clock frequency of 200 MHz, the throughput on VGG16, ResNet50, and YOLOv2-tiny reaches 163.0 GOPS, 107.9 GOPS, and 121.2 GOPS, respectively. The corresponding FPGA chip power is 7.6 W, 6.8 W, and 6.7 W, and the evaluation board power is 20.4 W, 19.8 W, and 19.2 W, respectively.
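As a point of reference for work item 2), the following is a minimal Python sketch of a roofline-style search over feature-map tiling parameters. It is not the thesis's actual method: the cost model, the hardware parameters (num_pe, bw_bytes_per_cycle, buf_bytes, word_bytes), and the layer dictionary keys are all illustrative assumptions.

    from itertools import product
    from math import ceil

    def divisors(n):
        # Candidate tile sizes: the divisors of each loop dimension.
        return [d for d in range(1, n + 1) if n % d == 0]

    def search_tiling(layer, num_pe=64, bw_bytes_per_cycle=8,
                      buf_bytes=512 * 1024, word_bytes=2):
        # layer: {"N": input channels, "M": output channels,
        #         "R"/"C": output rows/cols, "K": kernel size}.
        N, M, R, C, K = layer["N"], layer["M"], layer["R"], layer["C"], layer["K"]
        macs = M * N * R * C * K * K          # total multiply-accumulates
        best = None
        for Tm, Tn, Tr, Tc in product(divisors(M), divisors(N),
                                      divisors(R), divisors(C)):
            # On-chip storage for one input tile, weight tile, and output tile.
            in_tile = Tn * (Tr + K - 1) * (Tc + K - 1)
            w_tile = Tm * Tn * K * K
            out_tile = Tm * Tr * Tc
            if (in_tile + w_tile + out_tile) * word_bytes > buf_bytes:
                continue                      # does not fit in the on-chip buffer
            # Compute time: MACs divided by the parallelism actually usable.
            compute_cycles = macs / min(num_pe, Tm * Tn)
            # DMA time: bytes moved for all tiles of the layer over the bus.
            tiles = ceil(M / Tm) * ceil(N / Tn) * ceil(R / Tr) * ceil(C / Tc)
            dma_cycles = (tiles * (in_tile + w_tile + out_tile) * word_bytes
                          / bw_bytes_per_cycle)
            # The layer latency is bounded by the slower of compute and DMA.
            cycles = max(compute_cycles, dma_cycles)
            if best is None or cycles < best[0]:
                best = (cycles, (Tm, Tn, Tr, Tc))
        return best

    # Example: a hypothetical VGG16-style layer (224x224 output, 3x3 kernels).
    cycles, tiling = search_tiling({"N": 64, "M": 64, "R": 224, "C": 224, "K": 3})
    print(tiling, cycles)

In the spirit of the abstract, such an evaluation would be repeated for each convolutional layer, so that every layer runs with its own tiling parameters and the accelerator stays either compute-bound or bandwidth-bound by design rather than by accident.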
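Work item 3) quantizes the benchmark networks to 16-bit fixed point without fine-tuning. The abstract does not specify the exact scheme, so the NumPy sketch below shows one common per-tensor approach, assuming the fractional bit width is chosen from the tensor's dynamic range; the function names and the error check are illustrative.

    import numpy as np

    def quantize_fixed_point(x, total_bits=16):
        # Choose the number of fractional bits so the largest magnitude
        # still fits in a signed fixed-point word of total_bits bits.
        max_abs = float(np.max(np.abs(x)))
        int_bits = 0 if max_abs < 1.0 else int(np.ceil(np.log2(max_abs)))
        frac_bits = total_bits - 1 - int_bits     # 1 bit reserved for the sign
        scale = 2.0 ** frac_bits
        q = np.clip(np.round(x * scale),
                    -(2 ** (total_bits - 1)),
                    2 ** (total_bits - 1) - 1).astype(np.int16)
        return q, frac_bits

    def dequantize(q, frac_bits):
        # Map fixed-point codes back to floats, e.g. to measure accuracy loss.
        return q.astype(np.float32) / (2.0 ** frac_bits)

    # Example: quantize a random weight tensor and check the worst-case error.
    w = np.random.randn(64, 64, 3, 3).astype(np.float32)
    q, f = quantize_fixed_point(w)
    print(f, np.max(np.abs(dequantize(q, f) - w)))

Keeping weights and activations in 16-bit fixed point matches the native operand width of the FPGA's DSP blocks, which is consistent with the small accuracy degradation reported in the abstract.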
Keywords/Search Tags: FPGA, convolutional neural network, reconfigurable, accelerator