Fast Object Detection Model With Background Suppression Based On YOLOv3

Posted on: 2022-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2518306575966179
Subject: Computer technology
Abstract/Summary:
Object detection is a fundamental research area that has drawn intense attention in recent years, and it plays an important role in practical applications such as face recognition, mobile payment, digital security, robotics, and autonomous driving. The object detection task is divided into two main steps: 1) finding the location of the object to be detected in a digital image and marking it using two-point coordinates; 2) classifying the identified object. Most traditional object detection methods use manually designed feature extractors and sliding windows, which results in drawbacks such as low accuracy and slow speed. With the rise of deep learning in recent years, detection methods based on deep convolutional neural networks (CNNs) have partially remedied these drawbacks. However, new problems have emerged: network structures are complex, detection accuracy and speed are not easily balanced, and detectors are easily misled by background elements. The main reason is that, in order to extract as much semantic and location information as possible, current mainstream methods make extensive use of the Feature Pyramid Network (FPN) to fuse deep and shallow features and so improve detection accuracy. For example, in the classical YOLOv3 framework, the FPN structure accounts for a very large proportion of the overall network's computation and parameters, so the energy-efficiency ratio is relatively low.

To address these problems, this thesis proposes a stackable recalibratable feature pyramid network (SR-FPN) and an improved YOLOv3 object detection network structure. Its main innovations are: 1) designing the feature pyramid as a stackable structure to achieve a balance between speed and accuracy; 2) using efficient additive operations instead of concatenation operations to achieve the same accuracy with a substantial increase in speed; 3) optimizing the feature extraction structure of YOLOv3 to make it better suited to deployment on edge devices and to facilitate practical implementation. On the public Pascal VOC dataset, the proposed method achieves a 140% runtime speedup and a 91% parameter reduction at the cost of 2 points of mAP. Experiments show that the proposed method delivers a better energy-efficiency ratio on mainstream public object detection datasets.

The improved model performs well in both accuracy and speed, but it still cannot cope well with background interference. Therefore, building on the above improvements, this thesis further proposes an improved cutout data augmentation for the preprocessing stage. Compared with the traditional direct cutout method, the improved method no longer uses the same probability but masks the input image according to a Gaussian distribution, so that the network focuses more on the target itself rather than inferring from background information. For the forward propagation stage, this thesis also proposes a DropBlock method with an IoU regularization term to weaken the connection between the target and its surrounding information. These improvements against background interference further raise the detection accuracy of the overall network with no effect on running speed. Experiments show that the proposed methods can be used together to reach 79.92 mAP on the Pascal VOC dataset and an improvement of nearly 1 point over the baseline network on a collected and labeled anime face dataset.
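The cost argument behind replacing concatenation with addition can be illustrated with a small calculation. The sketch below is not the thesis's SR-FPN. It only counts the parameters and multiply-accumulates of a single hypothetical 3x3 convolution that consumes the fused feature map. The channel count (256), the 26x26 feature-map size, and all function names are illustrative assumptions.

```python
def conv_params(in_channels: int, out_channels: int, kernel_size: int = 3) -> int:
    """Weights plus biases of a standard 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def conv_macs(in_channels: int, out_channels: int, height: int, width: int,
              kernel_size: int = 3) -> int:
    """Multiply-accumulates of a stride-1, same-padded 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size * height * width


def fused_channels(deep_channels: int, shallow_channels: int, mode: str) -> int:
    """Channel count of the feature map produced by fusing two inputs."""
    if mode == "concat":
        # Concatenation stacks the two maps along the channel axis.
        return deep_channels + shallow_channels
    if mode == "add":
        # Element-wise addition requires matching shapes and keeps the width.
        if deep_channels != shallow_channels:
            raise ValueError("element-wise addition needs equal channel counts")
        return deep_channels
    raise ValueError(f"unknown fusion mode: {mode!r}")


# Hypothetical sizes chosen for illustration only (not taken from the thesis).
CHANNELS, HEIGHT, WIDTH = 256, 26, 26

for mode in ("concat", "add"):
    c_in = fused_channels(CHANNELS, CHANNELS, mode)
    params = conv_params(c_in, CHANNELS)
    macs = conv_macs(c_in, CHANNELS, HEIGHT, WIDTH)
    print(f"{mode:>6}: in_channels={c_in}, params={params}, MACs={macs}")
```

Under these assumptions, the convolution after a concatenation sees twice as many input channels and therefore needs about twice the weights and twice the multiply-accumulates of the convolution after an addition. The actual savings in the thesis depend on its full architecture, which the abstract does not specify.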
Keywords/Search Tags:object detection, feature pyramid network, energy efficiency, neural network