
Design And Implementation Of Deep Learning Accelerators For Object Recognition Tasks

Posted on: 2021-10-21   Degree: Master   Type: Thesis
Country: China   Candidate: R Huang
GTID: 2518306494996749   Subject: Software engineering
Abstract/Summary:
Object recognition remains an active research direction in computer vision, but deploying object recognition algorithms is still challenging. Most existing algorithms run on high-performance, high-power GPU platforms in order to meet real-time requirements, which makes them difficult to deploy in scenarios where power and hardware resources are limited. To address this, some researchers optimize the algorithms themselves by reducing their parameter count and computational cost; the results are known as lightweight object recognition algorithms. However, lightweight object recognition algorithms are still constrained by power consumption and resources.

To address this limitation, this thesis designs a neural network accelerator for lightweight object recognition algorithms. The accelerator architecture targets an FPGA hardware platform that meets the power and resource requirements. A software-hardware co-design scheme is also proposed, which makes the architecture scalable and tailorable so that it can be adapted to a variety of specific scenarios. Following the structure of lightweight object recognition algorithms, the accelerator implements the core operator modules, including Conv2D, PW (pointwise convolution), DW (depthwise convolution), ReLU, and Reshape. With quantization-based model compression and operator optimization for the hardware platform, the lightweight algorithm achieves real-time performance. The architecture is also instruction-based, which makes it more general: it supports object recognition algorithms both with and without lightweight modules.

To evaluate the accelerator, YOLO-Tiny (without lightweight modules) and MobileNetV3 + YOLOv3 (with lightweight modules) are configured through the host computer. The experimental results show that YOLO-Tiny reaches 80 FPS on the accelerator and that the optimized MobileNetV3 + YOLOv3 reaches 100 FPS.
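
As a rough illustration of why the DW and PW operators matter for a resource-limited platform, the sketch below compares the multiply-accumulate (MAC) count of a standard convolution with that of a depthwise convolution followed by a pointwise convolution. The function names and the layer shape (52x52 output, 128 to 256 channels, 3x3 kernel) are hypothetical values chosen for illustration and are not taken from the thesis.

```python
def conv2d_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a standard k x k convolution (bias ignored)."""
    return h_out * w_out * c_in * c_out * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a k x k depthwise (DW) conv followed by a 1 x 1 pointwise (PW) conv."""
    dw = h_out * w_out * c_in * k * k   # one k x k filter per input channel
    pw = h_out * w_out * c_in * c_out   # 1 x 1 conv that mixes channels
    return dw + pw


# Hypothetical layer shape, for illustration only.
H, W, C_IN, C_OUT, K = 52, 52, 128, 256, 3

std = conv2d_macs(H, W, C_IN, C_OUT, K)
sep = depthwise_separable_macs(H, W, C_IN, C_OUT, K)
ratio = sep / std  # equals 1/C_OUT + 1/K**2 for any shape

print(f"standard conv MACs:        {std}")
print(f"depthwise separable MACs:  {sep}")
print(f"separable / standard:      {ratio:.4f}")
```

For a 3x3 kernel the ratio approaches 1/9 as the number of output channels grows, which is the kind of reduction that lightweight backbones such as MobileNet rely on. This counts arithmetic only; actual throughput on an FPGA also depends on memory bandwidth and on how the operators are mapped to hardware.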
Keywords/Search Tags:hardware platform, model compression, lightweight neural network, FPGA