Design And Implementation Of Deep Learning Accelerators For Object Recognition Tasks

Posted on:2021-10-21

Degree:Master

Type:Thesis

Country:China

Candidate:R Huang

Full Text:PDF

GTID:2518306494996749

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

Object recognition remains an active research direction in computer vision, but deploying object recognition algorithms is still difficult. Most existing algorithms run on high-performance, high-power GPU platforms in order to meet real-time requirements, which makes them hard to deploy in scenarios where power and hardware resources are limited. To address this, some researchers optimize the algorithms themselves by reducing the number of parameters and the amount of computation; the results are known as lightweight object recognition algorithms. Even these lightweight algorithms, however, remain constrained by power and resource limits.

To address this limitation, this thesis designs a neural network accelerator for lightweight object recognition algorithms. The accelerator architecture targets an FPGA hardware platform that meets the power and resource requirements. A software-hardware co-design scheme is also proposed, which makes the architecture scalable and tailorable so that it can be adapted to a variety of specific scenarios. Following the structure of lightweight object recognition algorithms, the accelerator implements the core operator modules Conv2d, PW (pointwise convolution), DW (depthwise convolution), ReLU, Reshape, and others. Through quantization-based compression of the model and operator optimization for the hardware platform, the lightweight object recognition algorithm can run in real time. The architecture is also instruction-based, which makes it more general: it supports object recognition algorithms both with and without lightweight modules.

To evaluate the accelerator, the YOLO-Tiny algorithm (without lightweight modules) and the MobileNetV3+YOLOv3 algorithm (with lightweight modules) are configured from the host computer. The experimental results show that YOLO-Tiny reaches 80 FPS on the accelerator, and the optimized MobileNetV3+YOLOv3 algorithm reaches 100 FPS.
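
To illustrate why the lightweight modules reduce the amount of computation, the following minimal sketch compares the multiply-accumulate (MAC) count of a standard convolution with that of a DW convolution followed by a PW convolution. The function names and the layer dimensions are hypothetical and chosen only for illustration; they are not taken from the thesis, and bias terms and stride effects are ignored.

```python
# Illustrative sketch only: helper names and layer sizes are assumptions,
# not values or code from the thesis.

def conv2d_macs(h_out, w_out, c_in, c_out, k):
    """MAC count of a standard k x k convolution producing an h_out x w_out map."""
    return h_out * w_out * c_in * c_out * k * k


def depthwise_separable_macs(h_out, w_out, c_in, c_out, k):
    """MAC count of a depthwise (DW) k x k conv followed by a pointwise (PW) 1 x 1 conv."""
    dw = h_out * w_out * c_in * k * k   # one k x k filter per input channel
    pw = h_out * w_out * c_in * c_out   # 1 x 1 conv that mixes channels
    return dw + pw


if __name__ == "__main__":
    # Hypothetical layer shape for demonstration.
    shape = dict(h_out=52, w_out=52, c_in=128, c_out=256, k=3)
    standard = conv2d_macs(**shape)
    separable = depthwise_separable_macs(**shape)
    print(f"standard Conv2d MACs: {standard}")
    print(f"DW + PW MACs:         {separable}")
    print(f"reduction factor:     {standard / separable:.2f}")
```

Under these assumptions the reduction factor equals c_out * k^2 / (k^2 + c_out), so the saving grows with the number of output channels and approaches k^2 for wide layers.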

Keywords/Search Tags:

hardware platform, model compression, lightweight neural network, FPGA


Related items

1	Model Compression And Hardware Acceleration Of Convolutional Neural Networks
2	Research On The Compression And Hardware Acceleration Based On Convolutional Neural Network
3	Research On Lightweight Neural Network Data Compression Coding For Vision Terminal
4	Research And Implementation Of FPGA Accelerating Compressed Convolutional Neural Network
5	Research On Compression Method Of Deep Neural Network Model Oriented To Person Re-identification
6	Research On Memory Bus Width Aware Compression Technology Of Image Super-resolution Model Algorithm Based On FPGA
7	Design And Implementation Of Lightweight Convolutional Neural Network Accelerator On SoPC
8	High Performance Artificial Intelligence Computing With Algorithm-hardware Co-design
9	Design And Optimization Of Shifted Convolutional Neural Network Based On FPGA Platform
10	Research On CNN Network Acceleration For Image Classification Based On FPGA