This paper proposes a configurable convolutional neural network (CNN) accelerator based on ZYNQ, which can not only build various CNN models to perform edge inference tasks but also adjust the accelerator's hardware resource usage through configuration parameters, making better use of the target hardware platform. The main contributions of this work are as follows:

(1) To address the problem that the large volume of CNN parameters cannot be fully loaded into the on-chip cache, the accelerator integrates multiple data partitioning methods and proposes an adaptive data reuse strategy that reduces the total volume of transmitted data by comparing and analyzing the candidate reuse methods (a sketch of this comparison follows the summary below).

(2) To support fast construction of CNNs, this work encapsulates the required CNN parameters and defines a dedicated CNN instruction set; users can quickly build multiple CNN models by invoking this instruction set (a layer-descriptor sketch is given below).

(3) To facilitate deployment of the accelerator on different hardware platforms, this paper proposes a software-hardware co-configuration scheme. Data bit width, MAC array parallelism, and intermediate cache size are treated as configurable parameters, and different configuration options can be selected to adjust overall resource usage to fit the target FPGA platform (see the configuration sketch below).

(4) To achieve lower power consumption at the same throughput, a clock domain partitioning scheme is proposed: the core computing module runs in a high-frequency clock domain while non-core modules run in a low-frequency clock domain, which further improves circuit timing.

Experimental validation is conducted on the Xilinx ZCU104 board. The results show that with the MAC array parallelism set to 1024, the data bit width set to 8, and the core acceleration engine running at 180 MHz, the accelerator reaches a peak throughput of 180 GOPS with a power consumption of 3.752 W, for an energy efficiency of 47.97 GOPS/W. For the VGG16 network, the average MAC utilization of the convolutional layers reaches 84.37%. These results demonstrate that the proposed accelerator offers good configurability and high efficiency, making it well suited to edge computing and embedded devices.
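The adaptive data reuse strategy of contribution (1) is only summarized above. The sketch below is written under assumptions not stated in the abstract: a simplified traffic model that ignores kernel halos and assumes stride 1, and illustrative names such as `ConvShape`, `Tiling`, and the two `traffic_*` functions. It shows how a host-side scheduler could compare the data traffic of two candidate reuse schemes for a layer and pick the cheaper one; the paper's actual strategy may weigh more schemes and a more detailed cost model.

```cpp
#include <cstdint>
#include <cstdio>

// Ceiling division helper.
static inline uint64_t cdiv(uint64_t a, uint64_t b) { return (a + b - 1) / b; }

// Simplified convolution-layer shape (stride 1, input spatial size taken
// equal to the output spatial size, kernel halos ignored).
struct ConvShape { uint64_t n, m, h, w, k; };  // in-ch, out-ch, height, width, kernel
struct Tiling    { uint64_t tn, tm; };         // input / output channel tile sizes

// Output-reuse: partial sums stay on chip, so outputs are written once and
// weights are loaded once, but the input map is re-streamed for every
// output-channel tile.
uint64_t traffic_output_reuse(ConvShape s, Tiling t) {
    uint64_t weights = s.m * s.n * s.k * s.k;
    uint64_t inputs  = cdiv(s.m, t.tm) * s.n * s.h * s.w;
    uint64_t outputs = s.m * s.h * s.w;
    return weights + inputs + outputs;
}

// Input-reuse: each input-channel tile is loaded once, but partial output
// sums are written back and re-read on every pass over the input channels.
uint64_t traffic_input_reuse(ConvShape s, Tiling t) {
    uint64_t weights = s.m * s.n * s.k * s.k;
    uint64_t inputs  = s.n * s.h * s.w;
    uint64_t passes  = cdiv(s.n, t.tn);
    uint64_t outputs = (2 * passes - 1) * s.m * s.h * s.w;
    return weights + inputs + outputs;
}

int main() {
    // VGG16 conv3_1-like layer: 128 -> 256 channels, 56x56 output, 3x3 kernel.
    ConvShape s{128, 256, 56, 56, 3};
    Tiling    t{32, 32};  // illustrative tile sizes
    uint64_t a = traffic_output_reuse(s, t);
    uint64_t b = traffic_input_reuse(s, t);
    std::printf("output-reuse: %llu elements, input-reuse: %llu elements -> pick %s\n",
                (unsigned long long)a, (unsigned long long)b,
                a <= b ? "output-reuse" : "input-reuse");
    return 0;
}
```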
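Contribution (2) defines a dedicated CNN instruction set, but the abstract does not give its encoding. The following is a minimal sketch of what a per-layer instruction might look like; every field name, field width, and example address is an assumption made for illustration, not the paper's actual format. In this reading, "building a CNN" on the accelerator reduces to the host emitting one such descriptor per layer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical operation codes; the real instruction set is not specified
// in the abstract.
enum class Op : uint8_t { Conv = 0, Pool = 1, Fc = 2, End = 3 };

// One fixed-size instruction describing a single layer.  Field names,
// widths, and the presence of DDR base addresses are all assumptions.
struct LayerInstr {
    Op       op;
    uint16_t in_ch, out_ch;
    uint16_t in_h, in_w;
    uint8_t  kernel, stride, pad;
    uint8_t  relu;                 // fuse ReLU after the layer?
    uint32_t weight_addr;          // DDR byte offsets
    uint32_t ifmap_addr, ofmap_addr;
};

// A network is an ordered instruction list that the host driver writes
// into the accelerator's instruction buffer.
std::vector<LayerInstr> build_tiny_net() {
    return {
        {Op::Conv, 3,  64, 224, 224, 3, 1, 1, 1, 0x000000, 0x100000, 0x200000},
        {Op::Pool, 64, 64, 224, 224, 2, 2, 0, 0, 0,        0x200000, 0x300000},
        {Op::End,  0,  0,  0,   0,   0, 0, 0, 0, 0,        0,        0},
    };
}
```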
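For the software-hardware co-configuration of contribution (3), the sketch below shows one plausible way to expose data bit width, MAC array parallelism, and intermediate buffer size as compile-time parameters and to sanity-check them against the target device before synthesis. The constant names, the per-MAC DSP estimate, and the RAM budget are assumptions; only the XCZU7EV (ZCU104) DSP48 count of 1728 is a device fact, and the on-chip RAM figure is an approximation.

```cpp
// Hypothetical configuration knobs (the abstract names the parameters but
// not how they are set; these constants are illustrative).
constexpr int DATA_WIDTH_BITS = 8;     // activation/weight bit width
constexpr int MAC_PARALLELISM = 1024;  // number of MAC units in the array
constexpr int INTER_BUF_KB    = 512;   // intermediate cache size, KiB

// Coarse resource model used only as a pre-synthesis sanity check.
constexpr int DSP_PER_MAC     = 1;     // placeholder; depends on bit width and DSP packing
constexpr int ZCU104_DSP      = 1728;  // DSP48E2 slices on the XCZU7EV
constexpr int ON_CHIP_RAM_KB  = 4700;  // approx. BRAM + URAM on the XCZU7EV, rounded

static_assert(MAC_PARALLELISM * DSP_PER_MAC <= ZCU104_DSP,
              "MAC array exceeds the DSP budget of the target device");
static_assert(INTER_BUF_KB <= ON_CHIP_RAM_KB,
              "intermediate buffer exceeds the on-chip RAM budget");

int main() { return 0; }  // configuration-only translation unit; nothing to run
```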
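As a consistency check on the reported figures, the stated energy efficiency follows directly from the peak throughput and power:

\[
\text{Energy efficiency} = \frac{\text{Peak throughput}}{\text{Power}} = \frac{180\ \text{GOPS}}{3.752\ \text{W}} \approx 47.97\ \text{GOPS/W}.
\]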