Font Size: a A A

Research On High-resolution Traffic Sign Detection Based On Multi-scale Attention Fusion

Posted on:2024-01-30Degree:MasterType:Thesis
Country:ChinaCandidate:J H LiFull Text:PDF
GTID:2542307100488644Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
As autonomous driving gradually becomes a popular project,traffic sign detection,which is one of its key technologies,has also received equal attention.Existing traffic sign detection methods mainly face the following problems:(1)Traffic sign dataset has the problem of category imbalance.(2)In complex scenes,some traffic signs with similar appearance are difficult to distinguish,and other similar objects are easily misdetected as traffic signs.(3)It is difficult to balance inference speed and accuracy for traffic sign detection under high-resolution images.Where high-resolution image refers to 2048×2048,not 4K high-resolution image,and this is distinguished by higher resolution in the following.For the imbalance of data set categories in problem(1),a semi-dynamic data enhancement method is proposed in this thesis.The method firstly uses static sorting to obtain a priori information of the dataset,and secondly uses iterative update and data enhancement to balance the number of categories while updating the a priori information to achieve dynamic enhancement.Compared with the general static data enhancement method,the enhanced data categories of this method are more balanced.For problem(2)in complex scenes,some traffic signs with similar appearance are difficult to distinguish,and other similar objects are easily misdetected as traffic signs.In this thesis,we propose a traffic sign detection model based on weak semantic segmentation and feature refinement,which is named Layer-by-layer Feature Refinement Detector(LFRD).The model proposes a hierarchical clustering anchor frame and group loss based on traffic sign characteristics.The hierarchical clustering anchor frames are obtained by using K-means clustering algorithm to obtain the anchor frames of each layer according to the target scale,which can better adapt to the flexible scale changes of traffic signs;the grouping loss groups the categories with similar appearance and designs the loss function to guide the model to learn the subtle differences between similar traffic signs so as to reduce the misclassification.Next,a weak semantic segmentation module and a feature refinement module are added to the Neck layer.The weak semantic segmentation module learns shallow feature contextual information to segment credible and non-credible regions;the feature refinement module mines the contextual information of non-credible regions and actively learns and eliminates the interference that causes false detection to reduce the false detection of other similar objects.Finally,multi-scale feature fusion is optimized using channel attention.For problem(3),it is difficult to balance inference speed and accuracy for traffic sign detection in high resolution images.In this thesis,we propose an adaptive screening model and use it in combination with a region detection model to construct a two-stage traffic sign detection method for higher resolution images.The method first uses an adaptive region screening module constructed based on convolutional neural networks and multiscale attention fusion,which consists of four parts:backbone network,multiscale feature fusion,region feature extraction and prediction,sample strategy and loss.The performance of different lightweight backbone networks is analyzed using experiments and suitable backbone networks are selected according to requirements;multi-scale features are fused using improved channel attention;different methods of region feature extraction are analyzed according to experimental effects;multiple sample strategies and loss functions are compared using the same variable experiments.The adaptive filtering model filters out the regions where traffic signs do not exist in the higher resolution image,and then uses the region detection model to detect traffic signs in the remaining regions,and finally maps the detection results back to the original image and filters out the repeatedly detected traffic signs using the non-maximal suppression algorithm NMS.In this thesis,the above methods and models are tested and experimented on public datasets based on the Pytorch deep learning framework.The results show that the proposed LFRD model can focus on the subtle disparity between similar traffic signs and reduce the interference caused by other objects similar to the traffic signs,and can achieve a higher accuracy rate compared with existing schemes.At the same time,the two-stage traffic sign detection model proposed in this thesis for higher resolution images can well balance the contradiction between real-time and accuracy caused by higher resolution images,and achieve a reduced impact on detection accuracy while accelerating the inference speed.
Keywords/Search Tags:traffic sign detection, weak semantic segmentation, feature refinement, multiscale attention fusion, high resolution images
PDF Full Text Request
Related items