Research On Object Detection Method Based On Deep Learning

Posted on: 2024-07-24
Degree: Master
Type: Thesis
Country: China
Candidate: H R Wang
GTID: 2568307142452304
Subject: Computer technology
Abstract/Summary:
Object detection is the task of classifying and localizing objects in images or videos. Most previous object detection methods use spatial and channel information to build feature optimization algorithms, which extract more accurate features and thereby improve detection accuracy. How to mine features across space and channels is therefore a core technical question for object detection algorithms. This thesis constructs two feature extraction enhancement algorithms that can be transplanted into any object detection model to improve its detection accuracy.

First, a Dual Channel Space Interdependent Network is proposed to extract information-dependent features in space and channels; the key is to obtain the distribution of important information over the maximum and average features in space and channels. Although such methods offer large accuracy gains, they cannot avoid the loss of speed caused by the added network complexity. To balance detection accuracy and computational efficiency, this thesis constructs the Asymmetric Weight-sharing Convolution Network, which uses asymmetric convolution to reduce the number of model parameters and the amount of computation, thus increasing the network's running speed. It also trains the same asymmetric convolution kernel jointly to achieve weight sharing, which greatly enhances the robustness of the convolution kernel parameters, so that both accuracy and speed improve. With YOLOv4, YOLOv5, and EfficientDet as experimental baselines, the experimental results show maximum accuracy gains of 1.98% mAP and 2.6% mAP on the PASCAL VOC and MS COCO datasets, respectively, and a maximum gain of 9.3 FPS from adding the Asymmetric Weight-sharing Convolution Network.

Second, the emergence of self-attention mechanisms provides new directions for feature extraction in object detection, but most existing self-attention mechanisms focus on extracting correlations between global and local information in space or in channels, and fusing these features effectively remains a challenge. To this end, this thesis proposes a Pooling and Global Feature fusion Self-attention Mechanism that captures the correlations between fused features so that features are fused adaptively. It consists of three components: the Spatial Self-attention Pooling Fusion Module, the Channel Self-attention Pooling Fusion Module, and the Spatial and Channel Global Self-attention Fusion Module. The Spatial Self-attention Pooling Fusion Module extracts the fused global max pooling and global average pooling self-attention features in space, the Channel Self-attention Pooling Fusion Module extracts the fused global max pooling and global average pooling self-attention features in channels, and the Spatial and Channel Global Self-attention Fusion Module extracts the global fused feature relations across space and channels. Finally, the three fused feature relations are appended to the original features, which enhances feature extraction. With YOLOv4, YOLOv5, and EfficientDet as detection baselines, experiments on the PASCAL VOC and MS COCO datasets show accuracy increases of 2.58% mAP and 2.6% mAP, respectively. The experiments demonstrate that the proposed method substantially improves object detection accuracy.
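The parameter savings the abstract attributes to asymmetric convolution and weight sharing can be illustrated with a rough count. The sketch below is not the thesis's network. It assumes a single layer with equal input and output channels, ignores biases, and assumes that "weight sharing" means one 1×k weight tensor is reused (transposed) as the k×1 kernel. The function names `conv_params` and `compare` are hypothetical.

```python
def conv_params(c_in, c_out, k_h, k_w):
    """Number of weights in one convolution layer, ignoring biases."""
    return c_in * c_out * k_h * k_w


def compare(channels=256, k=3):
    """Compare weight counts for three ways of covering a k x k receptive field."""
    # Standard square k x k convolution.
    square = conv_params(channels, channels, k, k)
    # Asymmetric pair: a 1 x k convolution followed by a k x 1 convolution,
    # each with its own weights.
    asymmetric = (conv_params(channels, channels, 1, k)
                  + conv_params(channels, channels, k, 1))
    # Assumption: the k x 1 kernel reuses the 1 x k weights (transposed),
    # so only one weight tensor is stored and trained.
    shared = conv_params(channels, channels, 1, k)
    return {"square": square, "asymmetric": asymmetric, "shared": shared}


if __name__ == "__main__":
    for name, count in compare().items():
        print(f"{name}: {count} weights")
```

Under these assumptions, the asymmetric pair stores 2k weights per channel pair instead of k², and the shared variant stores k, so the saving grows with kernel size. How this translates into measured FPS depends on the actual architecture and hardware, which the abstract does not detail.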
Keywords/Search Tags: two-branch channel space dependence, asymmetric convolution, weight sharing, multi-level self-attention mechanism feature fusion, object detection