
Research On Multiscale Object Detection Algorithm Based On Deep Learning

Posted on: 2021-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: T C Jiao
Full Text: PDF
GTID: 2428330602971901
Subject: Information and Communication Engineering
Abstract/Summary:
Object detection is one of the basic problems of computer vision and underlies other vision tasks such as object tracking and instance segmentation. With the development of information technology, large volumes of image and video data are now part of everyday life, so object detection plays an increasingly important role. It is currently applied mainly in face recognition and autonomous driving. Research on object detection has a long history, but traditional methods have limitations when faced with large data samples. The development of deep learning has brought new ideas to object detection, and detection with convolutional neural networks has received widespread attention. This thesis studies object detection based on deep learning. The main work is as follows.

First, the YOLO algorithm is improved to address its single prediction level and the model complexity introduced by deepening the network. The improvements are as follows. Depthwise separable convolution is used to reduce the number of parameters. It is combined with the residual structure to build an inverted residual block, which alleviates the vanishing gradients that arise as the network deepens, and a bottleneck structure is used inside the inverted residual block to avoid losing feature information. A multi-scale feature fusion strategy is adopted: the feature layer obtained after 32× downsampling is upsampled by a factor of two and concatenated with the feature layer of the corresponding scale, and the fused feature layer is again upsampled by a factor of two and concatenated in the same way. The resulting three levels of features at different scales are used for prediction, which improves detection of multi-scale objects. Prior boxes are obtained by cluster analysis according to the object characteristics of the dataset, which improves the applicability of the model. To address the imbalance between positive and negative samples, the improved model is combined with Focal loss; compared with other object detection models, detection accuracy is improved.

Second, the SSD algorithm is improved to address the insufficient feature extraction capability of its VGG16 base network and the weak connection between shallow feature layers. The improvements are as follows. The base network is replaced with DarkNet53, whose stronger feature extraction capability improves the overall detection performance of the model. A feature fusion structure is introduced: the 19 × 19 and 10 × 10 feature maps are rescaled to 38 × 38 by bilinear interpolation, a 1 × 1 convolution adjusts the number of channels to 512, and the results are combined with the channel-adjusted 38 × 38 feature layer to generate a new feature layer. This new feature layer then serves as the 38 × 38 prediction layer in multi-scale prediction. The fusion structure strengthens the connection of feature information between shallow feature layers and enriches multi-scale feature information, thereby improving the detection accuracy of the SSD model.

To verify the effectiveness of the two improved algorithms, experiments are carried out on commonly used object detection datasets. The improved YOLO algorithm is verified on the INRIA pedestrian dataset. Focal loss is then applied to the improved algorithm, which is compared with two-stage object detection algorithms on the PASCAL VOC 2007 dataset. Finally, the improved YOLO algorithm is applied to solder joint defect detection. The improved SSD model is verified on the combined PASCAL VOC 2007 and 2012 datasets and compared with other algorithm models.
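
The parameter saving that the abstract attributes to depthwise separable convolution can be checked with simple arithmetic. The sketch below is illustrative only: the kernel size and channel counts are assumed values, not taken from the thesis, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Assumed, illustrative layer shape (not from the thesis).
k, c_in, c_out = 3, 256, 512
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
ratio = sep / std  # equals 1/c_out + 1/k**2
print(std, sep, round(ratio, 4))
```

For a 3 × 3 kernel the ratio is close to 1/9, so this layer would need roughly one ninth of the weights of its standard counterpart.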
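
The three prediction scales described for the improved YOLO model follow from the downsampling factors: upsampling a 32× downsampled layer by two gives a 16× layer, and repeating the step gives an 8× layer. A minimal sketch of the resulting grid sizes, assuming a hypothetical 416 × 416 input (the abstract does not state the input resolution):

```python
def prediction_grid_sizes(input_size, coarsest_stride=32, levels=3):
    """Grid side length at each prediction level, coarsest first.

    Each x2 upsampling halves the effective stride, so the strides are
    32, 16, 8 for three levels.
    """
    sizes = []
    stride = coarsest_stride
    for _ in range(levels):
        sizes.append(input_size // stride)
        stride //= 2  # upsample by a factor of two before the next fusion
    return sizes


# input_size=416 is an assumption used only for illustration.
grids = prediction_grid_sizes(416)
print(grids)
```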
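
The abstract does not say which clustering method is used for the prior boxes. A common choice in YOLO-style detectors is k-means over box widths and heights with 1 − IoU as the distance, and the sketch below assumes that approach. The function names, the deterministic initialisation, and the sample boxes are all hypothetical.

```python
def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    w, h = box
    cw, ch = centroid
    inter = min(w, cw) * min(h, ch)
    return inter / (w * h + cw * ch - inter)


def kmeans_anchors(boxes, k, iters=100):
    """Cluster (w, h) pairs into k prior boxes, assigning by highest IoU."""
    # Deterministic init (an assumption): spread seeds evenly over area order.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        new = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new.append((mean_w, mean_h))
            else:
                new.append(centroids[i])  # keep an empty cluster's centroid
        if new == centroids:
            break
        centroids = new
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Made-up ground-truth box sizes in pixels, for illustration only.
boxes = [(10, 20), (12, 22), (11, 21),
         (50, 100), (52, 98), (48, 102),
         (200, 150), (198, 152), (202, 148)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```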
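
Focal loss handles sample imbalance by down-weighting examples the model already classifies well. The sketch below shows the binary form, FL(p_t) = −α_t (1 − p_t)^γ log(p_t). The values α = 0.25 and γ = 2 are commonly used defaults and are assumptions here; the abstract does not state the settings used in the thesis.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class
    y: ground-truth label, 1 (object) or 0 (background)
    alpha, gamma: assumed defaults, not taken from the thesis
    """
    p = min(max(p, eps), 1.0 - eps)           # guard against log(0)
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_negative = focal_loss(0.1, 0)  # background predicted with confidence
hard_negative = focal_loss(0.9, 0)  # background mistaken for an object
print(easy_negative, hard_negative)
```

With γ = 0 and α = 0.5 the expression reduces to half the ordinary cross-entropy, which makes the effect of the modulating factor easy to isolate.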
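
The rescaling step in the SSD feature fusion structure can be sketched for a single channel. The code below covers only the bilinear interpolation of 19 × 19 and 10 × 10 maps to 38 × 38; the 1 × 1 convolution and the merge with the 38 × 38 layer are omitted because the abstract does not specify how the layers are combined. The corner-aligned coordinate mapping is an assumption, and `bilinear_resize` is a hypothetical helper, not code from the thesis.

```python
def bilinear_resize(grid, out_h, out_w):
    """Resize a 2D list of floats with bilinear interpolation.

    Uses a corner-aligned mapping (an assumption): the first and last
    output samples coincide with the first and last input samples.
    """
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
            bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


# Synthetic single-channel feature maps with values r + c, for illustration.
fmap19 = [[float(r + c) for c in range(19)] for r in range(19)]
fmap10 = [[float(r + c) for c in range(10)] for r in range(10)]
up19 = bilinear_resize(fmap19, 38, 38)
up10 = bilinear_resize(fmap10, 38, 38)
print(len(up19), len(up19[0]), len(up10), len(up10[0]))
```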
Keywords/Search Tags: Object detection, Deep learning, Inverted residual block, Feature fusion, Multi-scale prediction