Target Detection Accelerator Design Based On FPGA

Posted on:2023-06-01

Degree:Master

Type:Thesis

Country:China

Candidate:Z Gao

GTID:2568307025976899

Subject:Control Engineering

Abstract/Summary:

In recent years, object detection algorithms based on Convolutional Neural Networks (CNNs) have been widely used in fields such as industry, agriculture and the military. To cope with increasingly complex application scenarios, network structures have grown more complex, and the number of network parameters and computations has increased rapidly, which makes it harder to deploy object detection algorithms on low-power embedded devices. This dissertation therefore uses a programmable, energy-efficient Field Programmable Gate Array (FPGA) to design a low-power, high-throughput hardware accelerator for object detection algorithms, with the aim of implementing an efficient object detection system on embedded devices.

Firstly, this dissertation carries out a system-level study of object detection accelerators. It analyses the parameter count, structure and speed of several object detection models by constructing the Roofline model of the Xilinx ZCU104 platform. Based on the estimated computational efficiency of each algorithm on the platform, YOLO v2 is selected as the target of the accelerator design. The dissertation then optimizes the YOLO v2 algorithm for FPGAs: it simplifies the forward inference process by layer fusion, and quantizes the network parameters with a dynamic fixed-point representation to relieve the computational and storage pressure caused by floating-point data and operations. It also studies convolutional loop optimization and settles on a loop unrolling scheme that combines the input and output feature map channels, together with a loop interchange scheme that reuses output feature map data, which improves the parallelism and computational efficiency of the object detection system. Finally, a hardware/software co-processing mechanism and a system architecture based on on-chip heterogeneous computing are designed: each layer is computed on either the ARM processor or the FPGA, whichever suits it better, and the data path of the accelerator system is designed accordingly.

Secondly, an efficient and general-purpose accelerator IP core is designed to accelerate arbitrary computational layers of YOLO v2. Specifically, loop unrolling, loop blocking, ping-pong buffering and multi-channel data transmission are used to design the functional modules of the IP core, which optimizes data transfer and latency and improves the throughput of the accelerator. The dissertation then uses high-level synthesis tools to optimize the implementation of the IP core and completes the Block Design of the object detection system in Vivado. The bitstream, weights and other files are loaded onto the FPGA, and the corresponding functions are implemented on the Processing System (PS), which integrates a complete object detection system.

Thirdly, this dissertation conducts experiments on the Xilinx ZCU104 platform to verify the functional correctness of the design and to analyse its power consumption and performance. The experimental results show that the accelerator achieves a throughput of 28.3 GOPS and an energy efficiency of 7.1 GOPS/W at a power consumption of 3.98 W. This energy efficiency is 108 times that of running YOLO v2 on the CPU and 23.48 times that of running it on the GPU. A comparison with related work shows that the CNN accelerator designed in this dissertation achieves comparable throughput and power consumption and meets the requirements of embedded application scenarios.
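
The layer fusion and dynamic fixed-point quantization steps named in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. It assumes that layer fusion means folding batch normalization into the preceding convolution (a common choice that the abstract does not confirm), and that each layer picks its own fractional bit width from the largest parameter magnitude. The function names and the word length `word_bits` are hypothetical; the abstract does not state the bit width used.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch normalization into convolution weights and bias.

    weights[c] is the flattened kernel of output channel c (illustrative layout).
    BN computes gamma * (x - mean) / sqrt(var + eps) + beta with x = w . in + b,
    so scaling w and adjusting b gives the same output with one layer fewer.
    """
    folded_w, folded_b = [], []
    for c in range(len(weights)):
        scale = gamma[c] / math.sqrt(var[c] + eps)
        folded_w.append([w * scale for w in weights[c]])
        folded_b.append((bias[c] - mean[c]) * scale + beta[c])
    return folded_w, folded_b


def choose_frac_bits(values, word_bits=16):
    """Pick the fractional bit count so the largest magnitude fits a signed word."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return word_bits - 1
    # Integer bits needed for the magnitude; may be negative for small values.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return word_bits - 1 - int_bits


def quantize(values, frac_bits, word_bits=16):
    """Convert floats to signed fixed-point codes, saturating at the word limits."""
    scale = 2.0 ** frac_bits
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]


def dequantize(codes, frac_bits):
    """Map fixed-point codes back to floats for error checking."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


if __name__ == "__main__":
    # Toy single-channel layer; all numbers are made up for illustration.
    w, b = fold_batchnorm([[1.0, 2.0]], [0.0], [2.0], [1.0], [1.0], [1.0], eps=0.0)
    flat = w[0] + b
    q = choose_frac_bits(flat, word_bits=8)
    codes = quantize(flat, q, word_bits=8)
    print(q, codes, dequantize(codes, q))
```

Because the fractional bit width is chosen per group of values, layers with small weights keep more fractional precision than a single global fixed-point format would allow, which is the usual motivation for a dynamic fixed-point scheme.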

Keywords/Search Tags:

FPGA, object detection, CNN, hardware accelerator

Related items

1	Real-time Multi-object Detection And Tracking Algorithm Based On FPGA
2	Implementation Of Moving Object Detection Base On FPGA
3	Research And Implementation Of High-speed Object Detection Network Based On FPGA Accelerator
4	Implementation And Application Of Hardware Accelerator Based On Image Recognition Technology
5	Compilation Optimization And Hardware Acceleration Of Object Detection Algorithm Based On Regional Proposal Network
6	Research On FPGA-based Accelerator For Object Detection Neural Network
7	Research On Lightweight Convolutional Neural Network Algorithm And Hardware Collaborative Acceleration Technology
8	Research On Implement Of DMC Controller With Hardware Accelerator
9	Design Of Energy-Efficient Object Detection System Based On FPGAs
10	Research On Neural Network Accelerator Customization Method For Large-scale Reconfigurable Hardware