With the increasing depth of convolutional neural networks, the growing amount of computation and data places great pressure on hardware platforms. The FPGA not only offers low power consumption and high parallelism, but can also be reprogrammed repeatedly, which lowers development difficulty; it has therefore become a focus of research on accelerating convolutional neural networks in hardware. In this paper, a hardware/software co-design of a convolutional neural network accelerator is carried out on the ZYNQ platform, which integrates an FPGA fabric and an ARM processor. The main work is as follows. Firstly, to address the large number of network parameters, the network structure and weight parameters of YOLOv3-Tiny are analyzed, and 16-bit fixed-point quantization is adopted as the quantization scheme of this paper. To address the large number and long duration of data transfers between DDR and BRAM, the normalization layer and activation function are designed to reduce the number of data round trips and the time they consume. A 4x4 parallel multiplier array is designed according to the various FPGA acceleration schemes and the resources of the ZYNQ platform. Secondly, following the bus transfer protocol and the parallel optimization scheme, the hardware modules on the PL side are designed with the Vivado HLS high-level synthesis tool, and optimization strategies such as multi-channel transmission and ping-pong pipelining are adopted to reduce system latency. Finally, three modules are implemented on the PS side: image preprocessing, the hardware IP driver, and result calculation and display. The experimental results are analyzed and compared with the acceleration achieved on other hardware platforms. The designed ZYNQ-based accelerator achieves a throughput of 17.76 GOPS, about 20 times the performance of the CPU (Intel i5-9300H). Its energy efficiency is 8.29 GOPS/W, 34 times that of the CPU (Intel i5-9300H) and 11.5 times that of the GPU (GTX 1060), realizing the design goal of low power consumption and high performance.
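As a minimal sketch of the two techniques named above (16-bit fixed-point quantization and a 4x4 parallel multiply-accumulate array), the C++ fragment below illustrates one possible form; the Q8.8 bit allocation, function names, and HLS pragma choices are assumptions for illustration, since the abstract does not specify them.

```cpp
#include <cstdint>
#include <cmath>
#include <algorithm>

// Assumed Q8.8 layout for the 16-bit fixed-point scheme: 8 integer bits,
// 8 fractional bits. The actual bit allocation used in the thesis may differ.
constexpr int FRAC_BITS = 8;

// Quantize a floating-point weight or activation to a 16-bit fixed-point value.
static int16_t quantize(float x) {
    long v = std::lround(x * (1 << FRAC_BITS));
    // Saturate to the int16 range instead of wrapping around.
    v = std::max<long>(INT16_MIN, std::min<long>(INT16_MAX, v));
    return static_cast<int16_t>(v);
}

// 4x4 parallel multiply-accumulate tile: with the loops fully unrolled in HLS,
// the 16 products can map to independent DSP slices and execute in parallel.
void mac_tile_4x4(const int16_t a[4][4], const int16_t b[4][4], int32_t acc[4][4]) {
#pragma HLS PIPELINE II=1
    for (int i = 0; i < 4; ++i) {
#pragma HLS UNROLL
        for (int j = 0; j < 4; ++j) {
#pragma HLS UNROLL
            // 16x16 -> 32-bit product accumulated at higher precision;
            // the accumulator carries 2*FRAC_BITS fractional bits.
            acc[i][j] += static_cast<int32_t>(a[i][j]) * b[i][j];
        }
    }
}
```

In actual Vivado HLS code the operands would more likely use ap_fixed or ap_int types; plain int16_t/int32_t are used here only to keep the sketch self-contained and compilable outside the HLS toolchain.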