Optimization And Implementation Of YOLOv3 Model Based On FPGA

Posted on: 2021-12-20

Degree: Master

Type: Thesis

Country: China

Candidate: Y X Du

GTID: 2518306050466444

Subject: Circuits and Systems

Abstract/Summary:

With advances in artificial intelligence and improvements in hardware, object detection technology has developed rapidly in recent years. Deep learning-based detection algorithms have emerged in large numbers and are gradually being applied to all aspects of life. However, almost all commonly used object detection algorithms have high computational complexity and rely on high-performance computers to complete the task. Many application scenarios for object detection, such as autonomous driving, navigation guidance, and traffic supervision, require the algorithm to be deployed on mobile devices. Mobile devices often lack high-performance computing capability, so the better-performing deep learning detection algorithms are difficult to deploy on them.

This thesis carries out targeted algorithm optimization and hardware implementation based on the computing characteristics of the FPGA and the strengths and weaknesses of the YOLOv3 object detection algorithm, in order to fill the gap in object detection on mobile terminals. The main research contents and results are as follows:

1. Targeting the characteristics of the convolution operation, the parallel computing capability of the FPGA is used to design a convolution multiplication array. The input and output data bit widths are unified, and more than 2,000 DSP units are invoked simultaneously for multiplication, achieving highly parallel computation with an operation speed 256 times that of a single convolution operation.

2. To address the problem that the data for the convolution operation is fetched too many times, a three-row shift register is used to buffer the data, so that each pixel only needs to be read once to complete all convolution operations that use it. With this design, the number of pixel reads is only one-ninth of the original.

3. To address the problem of excessive cached data in the object detection algorithm: on a high-performance computer, up to 16 GB of memory can be used to cache all intermediate data, but the FPGA has no memory resource of that size. A dedicated RAM interaction scheme is therefore designed, which uses only a little over one hundred Mbit of RAM resources to run the entire algorithm and output the detection results.

4. The YOLOv3 algorithm consists of a feature extraction part and a three-scale detection part. A pipelined design is adopted, which effectively improves the operation speed of the algorithm and reduces the operation time to half of the original.

5. Based on the designed FPGA architecture, an algorithm optimization strategy is proposed. First, when training the YOLOv3 model, regularization constraints are added to the BN layer parameters so that more of them move toward 0, allowing the network to be pruned with a smaller impact on accuracy. Second, to reduce the size of the model parameters, finite-bit quantization is applied to them, shrinking the parameters as much as possible at the cost of a small loss in accuracy. Finally, with reference to existing network pruning methods, a new pruning method that fits the hardware design of this thesis is proposed. These three methods are used to optimize the model before deploying it on the FPGA, which greatly improves the operation speed of the algorithm.
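The three-row buffering idea in item 2 can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the thesis's hardware design: it models the buffer at row granularity in Python rather than as a register-level shift chain, uses a 3x3 kernel with no padding, and the function names `conv3x3_naive` and `conv3x3_line_buffer` are hypothetical. It counts how many pixel fetches each approach needs.

```python
from collections import deque


def conv3x3_naive(image, kernel):
    """Direct 3x3 sliding-window sum (no padding, no kernel flip).

    Every output value fetches its 9 input pixels again, so interior
    pixels are read up to 9 times.
    """
    h, w = len(image), len(image[0])
    reads = 0
    out = []
    for r in range(h - 2):
        out_row = []
        for c in range(w - 2):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += image[r + i][c + j] * kernel[i][j]
                    reads += 1  # one fetch from the source image
            out_row.append(acc)
        out.append(out_row)
    return out, reads


def conv3x3_line_buffer(image, kernel):
    """Same computation, but pixels are streamed once into a 3-row buffer.

    Assumption: the deque stands in for the three-row shift register;
    all window accesses hit the buffer, not the source image.
    """
    h, w = len(image), len(image[0])
    rows = deque(maxlen=3)  # oldest row is dropped when a new one arrives
    reads = 0
    out = []
    for r in range(h):
        new_row = []
        for c in range(w):
            new_row.append(image[r][c])
            reads += 1  # each pixel is fetched from the source exactly once
        rows.append(new_row)
        if len(rows) < 3:
            continue  # buffer not yet full, no output row available
        out_row = []
        for c in range(w - 2):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += rows[i][c + j] * kernel[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out, reads


if __name__ == "__main__":
    image = [[r * 8 + c for c in range(8)] for r in range(8)]
    kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    naive_out, naive_reads = conv3x3_naive(image, kernel)
    buf_out, buf_reads = conv3x3_line_buffer(image, kernel)
    print(naive_out == buf_out, naive_reads, buf_reads)
```

Without padding, the read ratio is H*W / (9*(H-2)*(W-2)), which approaches one-ninth as the image grows; on a tiny 8x8 image the border effect makes the ratio noticeably larger than that.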

Keywords/Search Tags:

FPGA, Convolutional Neural Network, YOLOv3, Model optimization

Related items

1	Design Of YOLOv3-Tiny Algorithm Based On FPGA
2	Research On Optimization Technologies Of FPGA-based Convolutional Neural Network Implementation
3	Research On The Acceleration And Optimization Method Of Convolutional Neural Network
4	A Convolutional Neural Network Accelerator Based On FPGA
5	Research On Mask Wearing Detection Model Based On Improved YOLOv3
6	Research On Person Counting Method Based On Improved YOLOv3
7	Implementation And Research Of FPGA-based Convolutional Neural Network Accelerator
8	Research On Target Recognition Algorithm Based On Convolutional Neural Network
9	Research On Target Localization Based On Neural Network
10	Research On Convolutional Neural Network Optimization Based On FPGA Cluster Heterogeneous Platform