
Research and Implementation of an Edge-Side Inference Accelerator for Convolutional Neural Networks Based on ZYNQ

Posted on: 2022-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: S W Cui
Full Text: PDF
GTID: 2518306338974099
Subject: Master of Engineering
Abstract/Summary:
The Convolutional Neural Network (CNN) is an important research direction in the current field of artificial intelligence, especially in computer vision, where it has achieved major breakthroughs in object detection, image tracking, image classification, and other tasks. However, as network performance has improved, the computational density and memory requirements of these algorithms have risen sharply, making it difficult to deploy and run models on embedded edge devices with limited computing resources, storage, and energy budgets, which in turn restricts the development and adoption of computer vision applications. Because FPGAs combine high performance, low power consumption, and programmability with a flexible and efficient parallel architecture, they are an ideal platform for accelerating CNN inference at the edge. This thesis studies a CNN edge-side inference accelerator based on the ZYNQ platform, optimizing the design from two perspectives: the CNN algorithm and the hardware architecture.

On the algorithm side, operator fusion, splitting, and reorganization are first applied to improve the execution efficiency of the algorithm on the hardware; the model is then quantized to a dynamic fixed-point, low-bit-width data format, reducing hardware resource usage and increasing computation speed while preserving model accuracy.

On the hardware side, the convolution and pooling layers, which contain large amounts of parallel computation, are mapped to the FPGA fabric. By analyzing the parallelism of the computation, an appropriate loop-unrolling scheme and degree of parallelism are chosen to maximize convolution speed; pipelining is then applied to raise the throughput of the convolution computation, and strategies such as ping-pong buffering and multi-channel data transfer are used to relieve memory-access bottlenecks. Using YOLOv4-tiny as the target algorithm, the complete mapping flow from the CNN model to ZYNQ is described.

Finally, the accelerator hardware and software systems are designed and implemented on a ZYNQ-7020 target platform, and the accelerator is evaluated comprehensively in four respects: detection accuracy, resource utilization, speed, and power consumption. The results show that the proposed CNN accelerator design delivers high computing performance under limited resource and power budgets, offers a high degree of configurability and portability, and is well suited to accelerating CNN model inference on embedded edge platforms.
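The dynamic fixed-point quantization step described above can be illustrated with a minimal sketch: each layer receives its own fractional bit width, chosen from the layer's largest weight magnitude, and the weights are then rounded onto that grid. The 8-bit word length and the helper names below (choose_frac_bits, quantize_layer) are assumptions for illustration, not the exact scheme used in the thesis.

```cpp
// Hedged sketch of per-layer dynamic fixed-point quantization (8-bit assumed).
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Pick the number of fractional bits for one layer's weights.
int choose_frac_bits(const std::vector<float>& w, int total_bits = 8) {
    float max_abs = 0.0f;
    for (float v : w) max_abs = std::max(max_abs, std::fabs(v));
    if (max_abs == 0.0f) return total_bits - 1;
    // Integer bits needed so that max_abs stays representable; this may be
    // negative for layers whose weights are all well below 1.0.
    int int_bits = (int)std::floor(std::log2(max_abs)) + 1;
    return total_bits - 1 - int_bits;          // one bit reserved for the sign
}

// Quantize the layer's weights to int8 with scale 2^frac_bits.
std::vector<int8_t> quantize_layer(const std::vector<float>& w, int frac_bits) {
    std::vector<int8_t> q(w.size());
    const float scale = std::ldexp(1.0f, frac_bits);   // scale = 2^frac_bits
    for (size_t i = 0; i < w.size(); ++i) {
        long r = std::lround(w[i] * scale);
        q[i] = (int8_t)std::clamp(r, -128L, 127L);     // saturate to int8 range
    }
    return q;
}
```

Likewise, the loop-unrolling and pipelining strategy on the FPGA side is the kind of structure typically expressed with HLS pragmas. The following compute-tile sketch assumes a Vivado/Vitis HLS C++ flow; the tile sizes TM, TN, K and the function name conv_tile are illustrative, not the thesis's actual parameters.

```cpp
// Minimal HLS-style sketch of a convolution compute tile with loop unrolling
// and pipelining over fixed-point data. Tile sizes are assumed for illustration.
#define TM 4   // parallelism over output channels (assumed)
#define TN 4   // parallelism over input channels (assumed)
#define K  3   // convolution kernel size

// Multiply-accumulate one K x K window for TM output channels in parallel.
void conv_tile(const short in[TN][K][K],       // fixed-point activations
               const short w[TM][TN][K][K],    // fixed-point weights
               int acc[TM]) {                  // running partial sums
#pragma HLS PIPELINE II=1                      // accept a new tile every cycle
    for (int tm = 0; tm < TM; tm++) {
#pragma HLS UNROLL                             // TM parallel MAC trees
        int sum = 0;
        for (int tn = 0; tn < TN; tn++) {
#pragma HLS UNROLL
            for (int kr = 0; kr < K; kr++)
                for (int kc = 0; kc < K; kc++)
                    sum += (int)w[tm][tn][kr][kc] * (int)in[tn][kr][kc];
        }
        acc[tm] += sum;
    }
}
```

In a complete design, the activation and weight tiles feeding such a kernel would be served from on-chip ping-pong (double) buffers so that DDR transfers overlap with computation; that data-movement plumbing is omitted here.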
Keywords/Search Tags: convolutional neural network, edge-side accelerator, ZYNQ, YOLOv4-tiny (You Only Look Once)