Semantic segmentation, a fundamental task in computer vision, underpins many applications such as autonomous-driving scene analysis and medical lesion image analysis. Training fully supervised semantic segmentation models requires a large number of annotated images, yet pixel-level annotation is time-consuming. To address this, semi-supervised semantic segmentation has become an active research area: it trains neural networks with a limited number of annotated samples, a large pool of unlabeled images, and pseudo-label generation, thereby reducing annotation cost. Mainstream deep learning approaches extract image features and classify pixels with convolutional neural networks, but recent studies have shown that Vision Transformer-based segmentation methods outperform convolutional ones. Meanwhile, existing semi-supervised methods based on pixel-level contrastive learning suffer from high computational cost and difficult sampling, which limit the effectiveness of pixel-level classification. To address these problems, this paper proposes a Transformer-based, mask-level semi-supervised semantic segmentation method. The main contributions are as follows:

(1) To address the large training-data requirement of Transformer-based architectures, this paper designs a pre-training method for the Transformer-based segmentation decoder. Semantic segmentation annotations are generated in a self-supervised fashion from the attention matrices of Vision Transformer backbone networks on large-scale datasets, and these generated annotations are used to pre-train the decoder.

(2) To address the problems of pixel-level contrastive learning in semi-supervised semantic segmentation, this
paper designs a mask-based contrastive learning method comprising a Mask Contrastive (MC) loss and a Mask Feature Contrastive (MFC) loss. The MC loss first divides the segmentation annotations into per-category masks and then performs contrastive learning among masks of different categories, which eliminates the false negative samples that afflict pixel-level contrastive learning. The MFC loss uses the masks to obtain a feature representation for each category and performs contrastive learning between these category representations, avoiding the extra storage needed to maintain category features and solving the sampling difficulty.

(3) This paper designs an Ensemble Mask Consistency (EMC) loss. Unlike the masks derived from segmentation labels, the masks predicted by the network may include multiple masks for the same category; this paper therefore merges predicted masks that represent the same category and enforces consistency regularization against the segmentation annotations.

(4) This paper conducts experiments on two widely used datasets, Pascal VOC and Cityscapes. Experimental results show that both the proposed self-supervised pre-training method and the mask-level semi-supervised segmentation method outperform current state-of-the-art algorithms on these public datasets, and extensive ablation studies confirm the effectiveness of the proposed components.
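The mask-pooling idea behind the MFC loss can be sketched as follows. This is a minimal NumPy illustration under our own assumptions: the function names, the InfoNCE-style loss form, and the temperature value are illustrative choices, not details taken from the paper. Per-category prototypes are obtained by average-pooling features inside each category mask, and prototypes of the same category from two views attract while different categories repel.

```python
import numpy as np

def mask_pooled_prototypes(features, labels, num_classes):
    """Average-pool features inside each category mask.

    features: (H, W, C) feature map
    labels:   (H, W) integer category map
    Returns {class_id: (C,) prototype} for the classes present.
    """
    protos = {}
    for k in range(num_classes):
        m = labels == k
        if m.any():
            protos[k] = features[m].mean(axis=0)
    return protos

def mask_feature_contrastive(protos_a, protos_b, temperature=0.1):
    """InfoNCE-style loss between category prototypes of two views:
    a class's prototype from view A should match the same class's
    prototype from view B and repel other classes' prototypes."""
    shared = sorted(set(protos_a) & set(protos_b))
    a = np.stack([protos_a[k] for k in shared]).astype(float)
    b = np.stack([protos_b[k] for k in shared]).astype(float)
    a /= np.linalg.norm(a, axis=1, keepdims=True)   # cosine normalization
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                  # (K, K) similarities
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))      # diagonal = positives
```

Because prototypes are pooled once per category rather than sampled per pixel, the number of contrastive pairs is bounded by the number of classes, which is the property the paper credits for avoiding the sampling and memory problems of pixel-level contrastive learning.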