Research On Accelerated Optimization Of Object Detection Algorithm Based On Heterogeneous Intelligent Computing Platform

Posted on:2023-12-02

Degree:Master

Type:Thesis

Country:China

Candidate:Y M Wang

Full Text:PDF

GTID:2558307163989739

Subject:Computer technology

Abstract/Summary:

Deep learning object detection algorithms are developing rapidly, while the foreseeable end of Moore's Law will slow progress in processor manufacturing processes. Fully exploiting the computing power of existing intelligent computing platforms has therefore become a practical choice. Based on a heterogeneous intelligent computing platform composed of a CPU and a Cambricon MLU270, this dissertation proposes a flexible deployment, optimization, and acceleration strategy for the YOLOv5 object detection algorithm that achieves efficient inference with low mAP (mean Average Precision) loss. After analyzing the characteristics of the YOLOv5 algorithm and of the software-hardware platform, YOLOv5 is first adapted to the platform through operator concatenation and implementation. A two-level acceleration strategy is then applied at the algorithm level and the system level. At the algorithm level, the network structure is adjusted by replacing its Focus module so that the platform characteristics can be exploited more effectively; the network slimming method is applied to reduce the number of weight parameters; and channel-by-channel INT8 fixed-point quantization is adopted to further reduce the parameter bit width. At the system level, task balancing is used to maximize MLU utilization, and two time-consuming operators are accelerated at the BANG C level. In addition, TFU fusion, graph optimization, and address optimization are performed together at the offline model generation stage, and a CPU/MLU three-stage pipeline is designed and used at the offline inference stage. The effect of these strategies is evaluated experimentally. The optimized YOLOv5s reaches an inference speed of 736 FPS (frames per second) on the Microsoft COCO 2017 dataset, more than 70 times higher than the baseline without optimizations, with an mAP loss of less than 1%; on the private dataset the inference speed reaches 853 FPS. Finally, an efficient YOLOv5 offline model was deployed in an aerial reconnaissance image recognition system. Experimental results show that for very large aerial images of size 7500×50000, the inference speed reaches 0.25 FPS at an mAP of 0.883, which lays a solid technical foundation for applications in special scenarios.
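
The channel-by-channel INT8 quantization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the exact scheme used on the MLU270, so this example assumes symmetric quantization in which each output channel gets its own scale derived from its largest absolute weight. The function names `quantize_per_channel` and `dequantize_per_channel` and the sample weights are hypothetical and chosen only for illustration.

```python
def quantize_per_channel(weights):
    """Symmetric INT8 quantization with one scale per channel (assumed scheme).

    `weights` is a list of channels, each a flat list of floats.
    Returns (quantized integer values, per-channel scales).
    """
    quantized, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        # Map the largest magnitude in this channel to 127;
        # fall back to a scale of 1.0 for an all-zero channel.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = [max(-127, min(127, round(w / scale))) for w in channel]
        quantized.append(q)
        scales.append(scale)
    return quantized, scales


def dequantize_per_channel(quantized, scales):
    """Recover approximate float weights from INT8 values and scales."""
    return [[v * s for v in channel] for channel, s in zip(quantized, scales)]


# Hypothetical weights: two channels with very different value ranges.
weights = [
    [0.5, -1.0, 0.25],
    [0.01, -0.02, 0.005],
]
q, scales = quantize_per_channel(weights)
restored = dequantize_per_channel(q, scales)
```

Because each channel has its own scale, the small-range channel keeps the full INT8 resolution instead of being rounded toward zero by a single scale shared across the whole tensor. That is the usual motivation for per-channel over per-tensor quantization.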

Keywords/Search Tags:

MLU, Object detection algorithms, YOLOv5, Optimization, Acceleration

Related items

1	Study Of Object Detection Algorithm Based On YOLOv5
2	Improved Object Detection Algorithm Based On YOLOv5
3	Research On Tiny Object Detection And Recognition Algorithm Based On YOLOv5
4	Research On Person Re-identification Method For Security Scenes Based On YOLOv5 And DeepSORT
5	Research On Low-Light Object Detection Method Based On Improved YOLOv5
6	Research And Application Of Small Object Detection Algorithm Based On YOLOv5
7	Research And Implementation Of Object Detection Acceleration Method Based On FPGA
8	Incremental Object Detection Based On YOLOv5 And EWC Model
9	Research On Small Target Detection Model Based On Optimized YOLOv5
10	Research On Improving The Accuracy Of YOLOv5 Object Detection Using Attention Mechanism