
Research On Neural Network Accelerator Based On PYNQ

Posted on: 2022-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Zhou
Full Text: PDF
GTID: 2518306353976929
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, the use of neural networks for complex visual tasks such as gesture recognition, face detection, and medical imaging has become a hot topic. With the rapid development of the Internet of Things, the demand for high-performance visual processing on embedded devices also keeps growing. However, deploying convolutional neural networks on resource-constrained devices remains a pressing problem. Owing to their low power consumption and high speed, FPGAs are well suited to Internet of Things scenarios, so this thesis uses an FPGA to design a highly parallel, low-power SoC based on several optimization techniques.

This thesis studies the deployment of convolutional neural networks, choosing YOLOv3-tiny as the target network, and explores parallel computing on the FPGA. Since floating-point arithmetic is wasteful of FPGA resources, the thesis improves on an existing quantization algorithm; the improved scheme not only has lower computational complexity than Google's solution but also preserves network performance. The thesis then explores mixed-precision quantization and finds a suitable hybrid quantization scheme through an iterative search, reducing data storage while keeping the mean average precision of the network acceptable.

For the hardware design, high-level synthesis tools are used to build the IP cores: a convolution IP, a batched multiply-accumulate IP, a pooling IP, and an upsampling IP. After optimizing the IP cores and the overall system, a design space exploration model is constructed to estimate system latency and to select appropriate design parameters.

The overall system is realized on the PYNQ platform; resource utilization and energy consumption are tabulated and compared with other similar work. In testing, the accelerator designed in this thesis is 86 times faster than the ARM Cortex-A9 for single-frame detection and shows clear advantages over CPU and GPU in terms of energy consumption. The design meets real-time requirements as far as possible under the resource constraints and provides a reference for deploying neural networks on other heterogeneous platforms.
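As a point of reference for the quantization work summarized above, the sketch below shows the standard affine (scale and zero-point) integer quantization scheme popularized by Google's TFLite approach, which the thesis takes as its baseline. It is a minimal NumPy illustration, not the improved algorithm developed in the thesis.

```python
import numpy as np

def affine_quantize(x, num_bits=8):
    """Affine (asymmetric) uniform quantization: map float values to
    unsigned integers using a scale and a zero point, as in the
    Google/TFLite baseline scheme referenced in the abstract."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a constant tensor to avoid division by zero.
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = int(np.clip(round(qmin - x_min / scale), qmin, qmax))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q, scale, zero_point):
    """Recover an approximate float tensor from its quantized form."""
    return scale * (q.astype(np.float32) - zero_point)

# Quick sanity check on random weights.
w = np.random.randn(64, 64).astype(np.float32)
q, s, z = affine_quantize(w)
print("max abs error:", np.abs(w - affine_dequantize(q, s, z)).max())
```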
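The following is a minimal sketch of what a design space exploration model for a tiled convolution IP can look like: it enumerates candidate parallelism factors, estimates cycle counts analytically, and keeps the fastest point that fits a resource budget. The latency formula and the one-DSP-per-MAC resource assumption are illustrative simplifications, not the model used in the thesis.

```python
import itertools
import math

def conv_latency(H, W, C_in, C_out, K, Tm, Tn, freq_mhz=100):
    """Rough cycle estimate for a tiled convolution layer with Tm
    output-channel and Tn input-channel parallelism; everything else
    is assumed to be serialized."""
    macs_per_pair = H * W * K * K
    cycles = math.ceil(C_out / Tm) * math.ceil(C_in / Tn) * macs_per_pair
    return cycles, cycles / (freq_mhz * 1e6)  # cycles, seconds

def explore(H, W, C_in, C_out, K, dsp_budget=220):
    """Enumerate (Tm, Tn) pairs and keep the fastest point that fits the
    DSP budget, assuming one DSP per parallel MAC (a crude resource model)."""
    best = None
    for Tm, Tn in itertools.product(range(1, 33), repeat=2):
        if Tm * Tn > dsp_budget:
            continue
        cycles, secs = conv_latency(H, W, C_in, C_out, K, Tm, Tn)
        if best is None or cycles < best[0]:
            best = (cycles, secs, Tm, Tn)
    return best

# Example: one mid-sized YOLOv3-tiny-like layer (illustrative shape only).
print(explore(H=26, W=26, C_in=256, C_out=512, K=3))
```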
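Finally, deployment on PYNQ typically follows the pattern sketched below: load the bitstream as an overlay, allocate physically contiguous buffers, and drive the accelerator over AXI-Lite from Python. The bitstream name, IP instance name, buffer shapes, and register offsets here are placeholders for illustration, not the actual design files from the thesis.

```python
import numpy as np
from pynq import Overlay, allocate

# "conv_accel.bit", "conv_ip_0", and the register offsets below are
# hypothetical names chosen for this sketch.
overlay = Overlay("conv_accel.bit")        # program the PL with the bitstream
conv_ip = overlay.conv_ip_0                # AXI-Lite controlled convolution IP

# Physically contiguous buffers the IP can reach through their physical addresses.
ifm = allocate(shape=(1, 3, 416, 416), dtype=np.int8)
ofm = allocate(shape=(1, 16, 208, 208), dtype=np.int8)
ifm[:] = 0                                 # in practice, the quantized input image goes here

conv_ip.write(0x10, ifm.physical_address)  # input feature-map base address
conv_ip.write(0x18, ofm.physical_address)  # output feature-map base address
conv_ip.write(0x00, 0x01)                  # ap_start
while (conv_ip.read(0x00) & 0x2) == 0:     # poll ap_done
    pass
```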
Keywords/Search Tags:Convolutional Neural Network, Hardware Acceleration, High-level Synthesis, Quantization, PYNQ-Z2