Research on Weakly Supervised Semantic Segmentation Method Based on Activation Modulation

Posted on: 2024-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: D S Shi
GTID: 2568307106475894
Subject: Electronic information
Abstract/Summary:
Semantic segmentation is a fundamental research direction in computer vision. The task uses computed feature representations to mimic how humans recognize images, and assigns a semantic category label to each pixel of a given image. In recent years, with the rapid development of deep learning, semantic segmentation has made considerable progress. However, obtaining pixel-level annotations for training segmentation networks is time-consuming and labor-intensive. Weakly supervised semantic segmentation based on image-level annotation has attracted extensive attention from researchers because it relies only on image category labels. Under image-level annotation, the main challenges are that the foreground area is too small, owing to the localization characteristics of the classification network, and that the background area contains too much noise during the optimization of the class activation map. This thesis studies weakly supervised semantic segmentation methods from the perspective of feature activation value modulation. The research contents are as follows:

(1) To address the problem that the foreground object area in the class activation map is too small, the first work proposes a weakly supervised semantic segmentation method based on adaptive activation enhancement. The method focuses on low-confidence regions in the class activation map and comprises three modules: an activation enhancement module, a scale adaptation module and a denoising module. The activation enhancement module uses an exponential function to redistribute the attention values learned by the attention module, enlarging the foreground activation area. The scale adaptation module uses a spatial attention model to learn multiple attention weights and adaptively fuses multiple feature maps. The denoising module uses the prediction maps of a multi-scale segmentation network to refine the pseudo-labels, marking areas where the predictions are inconsistent as unknown to further improve pseudo-label quality. Experimental evaluations on two commonly used datasets show that this algorithm achieves the best results compared with current advanced weakly supervised semantic segmentation methods.

(2) The second work proposes a weakly supervised semantic segmentation method based on competitive modulation. It expands the foreground target starting from the peak area of the class activation map, addressing the small coverage of the foreground target. The method uses a competitive modulation module to competitively regulate the peaks of discriminative regions, forcing the classifier to accurately cover more foreground object regions. To improve the reliability of the pseudo-labels, a reliable region mining module is also designed, consisting of three steps: background region removal, reliable pixel mining and retention of unmined target regions. This module can be optimized cyclically, which greatly improves pseudo-label accuracy. Extensive experimental results demonstrate the effectiveness of the algorithm, which delivers a substantial performance improvement over current advanced methods.

(3) To address excessive background noise in the optimization of class activation maps, the third work proposes a weakly supervised semantic segmentation method with self-attention fusion modulation. First, starting from the Transformer’s self-attention, an adaptive fusion module is designed. Based on the self-attention activation values and the hierarchical relationship between self-attention layers, it increases the importance of shallow self-attention and decreases the importance of deep self-attention. The fused self-attention map therefore activates the background area less while the class activation map is being optimized, which improves the credibility of the background area. A self-attention modulation module is also designed. It applies an exponential function to the fused self-attention map to widen the gap between foreground and background, yielding a more complete and accurate foreground target area from which the pseudo-label is obtained. Experiments show that this method achieves good segmentation results on both the PASCAL VOC2012 and COCO2014 datasets.
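As a rough illustration of the exponential modulation idea mentioned in the abstract, the sketch below rescales a toy attention map with a normalized exponential. The abstract does not give the actual formula, so the functional form, the function name `exp_modulate`, the parameter `alpha` and the toy values are all assumptions made for illustration, not the thesis's method.

```python
import numpy as np


def exp_modulate(attn: np.ndarray, alpha: float = 4.0) -> np.ndarray:
    """Rescale attention values in [0, 1] with a normalized exponential.

    Assumed (hypothetical) form: (exp(alpha * a) - 1) / (exp(alpha) - 1).
    It maps 0 to 0 and 1 to 1 and is convex for alpha > 0, so low
    (background-like) values shrink proportionally more than high ones.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    a = np.clip(attn, 0.0, 1.0)  # keep inputs inside the expected range
    return np.expm1(alpha * a) / np.expm1(alpha)


# Toy 2x2 "fused self-attention map" (made-up values, not thesis data).
fused = np.array([[0.9, 0.6],
                  [0.3, 0.1]])
modulated = exp_modulate(fused)

# Contrast between a foreground-like and a background-like location,
# measured as a ratio before and after modulation.
ratio_before = fused[0, 0] / fused[1, 0]
ratio_after = modulated[0, 0] / modulated[1, 0]
```

With this assumed form, the ratio between a high and a low activation always grows after modulation, which is one simple way to read "widening the gap between foreground and background". A larger `alpha` makes the effect stronger. The thesis may define the modulation differently.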
Keywords/Search Tags: semantic segmentation, weakly supervised learning, convolutional neural network, transformer, activation modulation