
Design And Implementation Of CNN Object Detection System On Embedded Platform

Posted on: 2020-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: M J Yan
Full Text: PDF
GTID: 2428330596476793
Subject: Engineering

Abstract/Summary:
Object detection is a research hotspot in computer vision, with broad application prospects in fields such as intelligent transportation, intelligent video surveillance, and aerospace. At present, object detection based on Convolutional Neural Networks (CNNs) outperforms traditional object detection algorithms by a wide margin. However, because CNNs have complex structures and heavy computational loads, real-time operation is difficult to achieve on embedded platforms with limited resources and low power budgets. As a result, industry often relies on laser, radar, and other sensors for object detection, which is relatively expensive. YOLO, one of the best-performing real-time object detection models at present, has a simple structure and fast detection speed, and compared with other CNN models it is better suited to low-power embedded platforms. In addition, after analyzing the advantages and disadvantages of GPUs, ASICs, and FPGAs in terms of power consumption, price, and other aspects, the Zynq-7020 embedded platform, which has an ARM+FPGA hardware architecture, is selected to design and implement an object detection system based on Tiny-YOLO.

The main content of this thesis includes:

1. Weighing calculation latency against accuracy loss, an 8-bit integer quantized inference scheme is proposed to alleviate the shortage of computing and storage resources on the embedded platform. Integer multiplication is used to reproduce the effect of floating-point multiplication, and the activation function is optimized to reduce the time needed to quantize the activations. In addition, a corresponding quantized training scheme is proposed to reduce the precision loss of the weights. Compared with the floating-point training and inference scheme, the model size is compressed by 75% and the calculation speed is increased by 2-3 times.

2. The computation is partitioned between hardware and software according to the structural characteristics of Convolutional Neural Networks. A parallel computing IP core for the Convolutional and Pooling layers is designed in the Programmable Logic (PL), while the Cortex-A9 Processing System (PS) implements the Softmax layer and the Non-Maximum Suppression algorithm. Based on an analysis of the parallelism of the Convolutional and Pooling layers and of the data volume and computation of the Tiny-YOLO model, the data storage, segmentation, and calculation methods are designed within the resource limits of the Zynq-7020 to accelerate the Convolutional and Pooling layers in parallel. Compared with serial processing on the PS, the detection speed is increased by about 300 times.

3. The hardware architecture of the object detection system is built on the Zynq-7020 platform, together with its software operating environment. Based on the above work, real-time object detection with the Tiny-YOLO model is realized through hardware and software co-design.

This thesis designs an object detection system for an embedded platform, balances speed against accuracy, and finally realizes a low-cost, real-time object detection system. Hardware simulation in the SDSoC development environment shows that the total on-chip power is only 2.9 W and the object detection speed reaches 23 FPS, which meets the application requirements of real-time object detection.
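The idea in the first contribution, using integer multiplication to reproduce the effect of floating-point multiplication, can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual scales, rounding rule, or bit widths, so everything here is an assumption for illustration: symmetric per-tensor scales, a 15-bit fixed-point shift, and the hypothetical helper names `quantize`, `fixed_point_multiplier`, `requantize`, and `int8_dot`.

```python
def quantize(x, scale):
    """Map a float to a signed 8-bit integer (assumed symmetric, per-tensor scale)."""
    q = int(round(x / scale))
    return max(-128, min(127, q))  # saturate to the int8 range


def fixed_point_multiplier(real_multiplier, shift=15):
    """Approximate a float multiplier as an integer m such that m / 2**shift is close to it."""
    return int(round(real_multiplier * (1 << shift))), shift


def requantize(acc, m, shift):
    """Rescale a wide integer accumulator using only an integer multiply and a shift."""
    # Adding half of the divisor before shifting rounds to nearest instead of truncating.
    rounded = (acc * m + (1 << (shift - 1))) >> shift
    return max(-128, min(127, rounded))


def int8_dot(weights_q, inputs_q, m, shift):
    """Dot product on int8 operands with integer-only rescaling of the result."""
    acc = sum(w * x for w, x in zip(weights_q, inputs_q))  # wide accumulator
    return requantize(acc, m, shift)


# Assumed scales for weights, inputs, and outputs (illustrative values only).
s_w, s_x, s_y = 1 / 64, 1 / 32, 1 / 32
w_q = [quantize(v, s_w) for v in (0.5, 0.25)]
x_q = [quantize(v, s_x) for v in (1.0, 2.0)]

# The float factor s_w * s_x / s_y is folded into one integer multiplier offline.
m, shift = fixed_point_multiplier(s_w * s_x / s_y)
y_q = int8_dot(w_q, x_q, m, shift)
print(y_q * s_y)  # dequantized result; the float reference is 0.5*1.0 + 0.25*2.0
```

In this sketch the only floating-point work happens once, when the multiplier is computed offline; the per-element path uses integer multiplies, adds, and shifts, which is the kind of arithmetic that maps cheaply onto FPGA logic. Storing weights as 8-bit integers instead of 32-bit floats is also consistent with the 75% model size reduction reported above.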
Keywords/Search Tags:Convolutional Neural Network, Object detection, Zynq embedded platform, real-time system