
Research On Neural Network Accelerator Based On PYNQ

Posted on: 2022-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Zhou
Full Text: PDF
GTID: 2518306353976929
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, the use of neural networks for complex visual tasks such as gesture recognition, face detection, and medical imaging has become a hot topic. With the rapid development of the Internet of Things, the demand for high-performance visual processing on embedded devices also keeps growing. However, deploying convolutional neural networks on resource-constrained devices remains a pressing problem. Owing to their low power consumption and high speed, FPGAs are well suited to Internet of Things scenarios, so this thesis uses an FPGA to design a highly parallel, low-power SoC based on several optimization techniques.

This thesis studies the deployment of convolutional neural networks, choosing YOLOv3-tiny as the target network, and explores parallel computing on the FPGA. Since floating-point arithmetic is wasteful of FPGA resources, the thesis improves on an existing quantization algorithm; the improved scheme not only has lower computational complexity than Google's solution but also preserves network performance. The thesis then explores mixed-precision quantization and finds a suitable hybrid quantization scheme through an iterative search, reducing data storage while keeping the mean average precision of the network acceptable.

For the hardware design, high-level synthesis tools are used to build the IP cores: a convolution IP, a batched multiply-accumulate IP, a pooling IP, and an upsampling IP. After optimizing the IP cores and the overall system, a design space exploration model is constructed to estimate system latency and to select appropriate design parameters.

The overall system is realized on the PYNQ platform; resource utilization and energy consumption are tabulated and compared with other similar work. In testing, the accelerator designed in this thesis is 86 times faster than the ARM Cortex-A9 for single-frame detection and shows clear advantages over CPU and GPU in terms of energy consumption. The design meets real-time requirements as far as possible under the resource constraints and provides a reference for deploying neural networks on other heterogeneous platforms.
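As a point of reference for the quantization work summarized above, the sketch below shows the standard affine (scale and zero-point) integer quantization scheme popularized by Google's TFLite approach, which the thesis takes as its baseline. It is a minimal NumPy illustration, not the improved algorithm developed in the thesis.

```python
import numpy as np

def affine_quantize(x, num_bits=8):
    """Affine (asymmetric) uniform quantization: map float values to
    unsigned integers using a scale and a zero point, as in the
    Google/TFLite baseline scheme referenced in the abstract."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a constant tensor to avoid division by zero.
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = int(np.clip(round(qmin - x_min / scale), qmin, qmax))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q, scale, zero_point):
    """Recover an approximate float tensor from its quantized form."""
    return scale * (q.astype(np.float32) - zero_point)

# Quick sanity check on random weights.
w = np.random.randn(64, 64).astype(np.float32)
q, s, z = affine_quantize(w)
print("max abs error:", np.abs(w - affine_dequantize(q, s, z)).max())
```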
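The following is a minimal sketch of what a design space exploration model for a tiled convolution IP can look like: it enumerates candidate parallelism factors, estimates cycle counts analytically, and keeps the fastest point that fits a resource budget. The latency formula and the one-DSP-per-MAC resource assumption are illustrative simplifications, not the model used in the thesis.

```python
import itertools
import math

def conv_latency(H, W, C_in, C_out, K, Tm, Tn, freq_mhz=100):
    """Rough cycle estimate for a tiled convolution layer with Tm
    output-channel and Tn input-channel parallelism; everything else
    is assumed to be serialized."""
    macs_per_pair = H * W * K * K
    cycles = math.ceil(C_out / Tm) * math.ceil(C_in / Tn) * macs_per_pair
    return cycles, cycles / (freq_mhz * 1e6)  # cycles, seconds

def explore(H, W, C_in, C_out, K, dsp_budget=220):
    """Enumerate (Tm, Tn) pairs and keep the fastest point that fits the
    DSP budget, assuming one DSP per parallel MAC (a crude resource model)."""
    best = None
    for Tm, Tn in itertools.product(range(1, 33), repeat=2):
        if Tm * Tn > dsp_budget:
            continue
        cycles, secs = conv_latency(H, W, C_in, C_out, K, Tm, Tn)
        if best is None or cycles < best[0]:
            best = (cycles, secs, Tm, Tn)
    return best

# Example: one mid-sized YOLOv3-tiny-like layer (illustrative shape only).
print(explore(H=26, W=26, C_in=256, C_out=512, K=3))
```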
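Finally, deployment on PYNQ typically follows the pattern sketched below: load the bitstream as an overlay, allocate physically contiguous buffers, and drive the accelerator over AXI-Lite from Python. The bitstream name, IP instance name, buffer shapes, and register offsets here are placeholders for illustration, not the actual design files from the thesis.

```python
import numpy as np
from pynq import Overlay, allocate

# "conv_accel.bit", "conv_ip_0", and the register offsets below are
# hypothetical names chosen for this sketch.
overlay = Overlay("conv_accel.bit")        # program the PL with the bitstream
conv_ip = overlay.conv_ip_0                # AXI-Lite controlled convolution IP

# Physically contiguous buffers the IP can reach through their physical addresses.
ifm = allocate(shape=(1, 3, 416, 416), dtype=np.int8)
ofm = allocate(shape=(1, 16, 208, 208), dtype=np.int8)
ifm[:] = 0                                 # in practice, the quantized input image goes here

conv_ip.write(0x10, ifm.physical_address)  # input feature-map base address
conv_ip.write(0x18, ofm.physical_address)  # output feature-map base address
conv_ip.write(0x00, 0x01)                  # ap_start
while (conv_ip.read(0x00) & 0x2) == 0:     # poll ap_done
    pass
```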
Keywords/Search Tags:Convolutional Neural Network, Hardware Acceleration, High-level Synthesis, Quantization, PYNQ-Z2