| Object detection has long been an important research direction in computer vision. Its main task is to find the objects in an image and mark their locations. Object detection has promoted the application of artificial intelligence in real life and plays an important role in fields such as autonomous driving, intelligent medical care, agricultural pest identification, and national defense. With the rapid development of deep learning and the support of massive amounts of labeled data, detection accuracy continues to improve, and in some respects its performance far exceeds that of humans. However, in specialized domains such as rare plant and animal identification, rare lesion detection, and defect detection, labeled data is scarce and the cost of labeling is prohibitive. Data labeling has therefore become a bottleneck for object detection in these fields, and a lower-cost solution is urgently needed. To address this challenge, many researchers have combined few-shot learning with object detection and proposed few-shot object detection methods to improve detection accuracy. However, three important issues remain to be solved: (1) the lack of labeled data means that the deep features extracted are insufficient to identify the target object, so recognition accuracy struggles to meet practical requirements, and the overfitting that is prevalent in few-shot settings limits the generalization ability of the detection model; (2) scaling the target image during recognition causes information loss, because the image features are smoothed, which makes accurate detection difficult; (3) the model does not focus sufficiently on the target in the input data, so too little key information is extracted. To solve these three problems, this paper makes the following two contributions on the basis of the Meta-YOLO model. (1) It proposes a
convolutional neural network with multi-scale feature enhancement. Within the framework of the original Meta-YOLO few-shot object detection model, a deeper feature extraction network is designed to compensate for its insufficient deep feature extraction and to obtain higher-level semantic information. Because the amount of data is small, residual skip connections are added to several convolutional blocks of the improved network to alleviate the overfitting or underfitting that occurs in the deepened network. To address the problem of varying input image sizes affecting model performance, an improved spatial pyramid pooling structure is introduced to aggregate features at different scales. (2) It proposes a few-shot object detection model based on an attention mechanism. To address the insufficient attention that few-shot object detection models pay to key detail information, a two-stage attention mechanism covering feature extraction and recognition is proposed. In the feature extraction stage, an SENet module that incorporates a residual skip connection is introduced to learn the correlations between features and enhance the expressiveness of the network; in the recognition stage, a CBAM module makes the network pay more attention to the details of the target in the image, improving the model's ability to detect few-shot objects. The two-stage attention mechanism significantly improves the model's handling of critical detail information. Ablation and comparative experiments on the VOC2007 and VOC2012 datasets show that, compared with the benchmark model, the multi-scale feature enhancement method based on CME increases mAP by 4.75% on average, and the two-stage attention mechanism for feature extraction and recognition increases it by 1.5% on average, which demonstrates the effectiveness of the two
methods proposed in this paper. |
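
To make two of the building blocks named in the abstract concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes the classic form of spatial pyramid pooling (max pooling over 1x1, 2x2, and 4x4 grids, concatenated into a fixed-length vector) rather than the improved variant the paper proposes, and it assumes one plausible placement of the residual connection around a squeeze-and-excitation block (an identity shortcut added to the reweighted features). The function names, the pyramid levels, the reduction ratio, and the random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def spatial_pyramid_pool(feat, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over n x n grids and concatenate.

    The output length is C * sum(n * n for n in levels), whatever H and W are,
    which is how pooling at several scales absorbs varying input sizes.
    """
    _, h, w = feat.shape
    pooled = []
    for n in levels:
        # Integer bin edges that cover the whole map even if n does not divide H or W.
        rows = [(i * h) // n for i in range(n + 1)]
        cols = [(j * w) // n for j in range(n + 1)]
        for i in range(n):
            for j in range(n):
                # Force every bin to span at least one row and one column.
                r0, r1 = rows[i], max(rows[i + 1], rows[i] + 1)
                c0, c1 = cols[j], max(cols[j + 1], cols[j] + 1)
                pooled.append(feat[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


def se_residual(feat, w1, w2):
    """Squeeze-and-excitation channel reweighting with an identity shortcut.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are stand-ins
    for learned fully connected layers (assumption: biases omitted).
    """
    squeeze = feat.mean(axis=(1, 2))              # (C,) global average pool
    hidden = np.maximum(w1 @ squeeze, 0.0)        # (C // r,) ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # (C,) sigmoid channel weights
    return feat + feat * gate[:, None, None]      # shortcut plus reweighted features


rng = np.random.default_rng(0)
channels, reduction = 8, 4  # hypothetical sizes
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1

# Two inputs with different spatial sizes give descriptors of the same length.
for h, w in [(13, 13), (10, 17)]:
    feat = rng.standard_normal((channels, h, w))
    vec = spatial_pyramid_pool(se_residual(feat, w1, w2))
    print((h, w), "->", vec.shape)
```

Both inputs produce a vector of length 8 x (1 + 4 + 16) = 168, which illustrates why pooling over a pyramid of grids lets a detector accept images of different sizes without rescaling them first. A CBAM block would add a spatial attention map on top of the channel gate shown here; it is omitted to keep the sketch short.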