Label Semantics And Transformer For Meta Learning Few-shot Object Detection

Posted on: 2022-09-04  Degree: Master  Type: Thesis
Country: China  Candidate: Y M Xiong  Full Text: PDF
GTID: 2518306602994029  Subject: Master of Engineering
Abstract/Summary:
Few-shot object detection aims to make an object detection model achieve good detection results using only a few annotated samples. This thesis focuses on Few-shot Object Detection via Feature Reweighting (FODFR), a few-shot object detection method that combines the meta-learning approach to few-shot learning with the YOLOv2 object detection model. Building on FODFR, we make full use of the semantic correlation between category labels and fully mine the information in the image to improve detection when only a small number of labeled samples are available. The main work is as follows:

(1) A few-shot object detection method based on spatial position information and feature reweighting is proposed. FODFR's feature reweighting adjusts only the channel dimension of the features and ignores the spatial information of the target. The proposed method uses meta features that contain spatial position information to adjust the spatial dimension of the features, which enhances the feature discrimination of the target area. In addition, when the feature extractor of FODFR fuses shallow features into deep features, the target integrity of the shallow features is destroyed. The proposed method takes multiple shallow features of different scales, changes their size with convolution and pooling operations, and then fuses them into the deep features so that the target integrity of the shallow features is retained. Ablation experiments on the PASCAL VOC data set show the effectiveness of this method, and comparative experiments show that it is better than the original FODFR method: under the 10-shot condition, mAP increases by 1.7%.

(2) A few-shot object detection method based on label semantics and feature reweighting is proposed. When learning the meta features of the base classes and the new classes, FODFR does not make full use of the association between base and new categories, which leads to poor expressive ability of the new-class meta features. The proposed method first uses the semantic information of the category labels to calculate the correlation between the base categories and the new categories. It then uses a graph convolutional network to transfer the meta features of the base categories to the new categories according to their degree of relevance, so that better new-class meta features can be learned when only a few new-class samples are available. Experiments comparing it with FODFR and other methods show that this method improves the detection of new classes: under the 10-shot condition, mAP improves by 0.5% over the FODFR method.

(3) A few-shot object detection method based on Transformer and feature reweighting is proposed, aimed at the low detection accuracy of FODFR caused by the lack of new-class samples and insufficient supervision information. This method designs a new position encoding method that cuts the two-dimensional image features into blocks and converts them into one-dimensional sequences. It then uses the self-attention mechanism of the Transformer model to fully mine the information contained in the image itself, giving different attention to each feature block to highlight the key areas in the sample features. This enhances the discrimination of the features and improves detection. Comparative experiments on the PASCAL VOC data set show that this method is better than the FODFR method: under the 10-shot condition, mAP increases by 1.9%.
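
The label-semantic transfer described in (2) can be illustrated with a small sketch. This is not the thesis implementation: it replaces the graph convolutional network with a single weight-free propagation step, and the cosine similarity, the normalization of edge weights, the blending factor `alpha`, the function names, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def propagate_meta_features(novel_label_vec, base_label_vecs,
                            base_meta, novel_meta, alpha=0.5):
    """Blend a new class's meta feature with base-class meta features.

    Hypothetical simplification of one graph-convolution step: edge weights
    are label-embedding cosine similarities (negatives clipped to 0),
    normalized to sum to 1. No learned weights, no nonlinearity.
    """
    sims = [max(cosine(novel_label_vec, b), 0.0) for b in base_label_vecs]
    total = sum(sims)
    if total == 0.0:
        # No related base class: keep the few-shot estimate unchanged.
        return list(novel_meta)
    weights = [s / total for s in sims]
    dim = len(novel_meta)
    # Relevance-weighted average of the base-class meta features.
    transferred = [
        sum(w * feat[i] for w, feat in zip(weights, base_meta))
        for i in range(dim)
    ]
    # Mix the few-shot estimate with the transferred feature.
    return [alpha * n + (1.0 - alpha) * t
            for n, t in zip(novel_meta, transferred)]


# Toy example with made-up label embeddings and meta features.
base_label_vecs = [[1.0, 0.0], [0.0, 1.0]]   # two base classes
base_meta = [[2.0, 4.0], [10.0, 10.0]]       # their meta features
novel_label_vec = [1.0, 0.0]                 # new class, close to base class 0
novel_meta = [0.0, 0.0]                      # weak few-shot estimate

refined = propagate_meta_features(
    novel_label_vec, base_label_vecs, base_meta, novel_meta
)
print(refined)
```

In this toy case the new class is semantically identical to the first base class and orthogonal to the second, so only the first base class contributes to the refined meta feature. A trained graph convolutional network would additionally learn a weight matrix per layer.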
Keywords/Search Tags: Few-Shot Learning, Object Detection, Deep Learning, Graph Convolution Networks, Transformer