Multi-level Semantic Information Adaptation For Semantic Image Segmentation

Posted on: 2021-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Liu
GTID: 2518306548482884
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, semantic segmentation algorithms based on deep learning have played an important role in the field of computer vision, and their applications are becoming increasingly extensive, including road scene understanding in autonomous driving systems, landing point localization for unmanned aerial vehicles, and medical image analysis. Semantic segmentation aims to classify pixels according to their semantic meaning, so that each pixel in the image has a corresponding class label. At present, semantic segmentation has seen two main breakthroughs. The first is the DeepLab series, which connects multiple dilated convolutions with different dilation rates in parallel. In this way, the feature maps capture rich receptive field information. However, excessive use of dilated convolutions leads to the loss of local information, which hinders the segmentation of small objects and local details in the image. To solve this problem, this thesis proposes a dilated convolution filling network (DCFNet), which not only obtains rich receptive field information but also makes full use of neighborhood pixels. In addition, a low-resolution branch and a lightweight feature pyramid module are designed. At the decoder, DCFNet combines multi-level semantic information using an appropriate channel ratio, which enhances the recognition of both large and small objects in the image. The second important development is the application of attention mechanisms to semantic segmentation. An attention mechanism assigns different attention weights to different pixels in the image, so that the network can focus on the parts that need more attention. However, the feature fusion operation that the attention mechanism applies to its input and output is not learnable. To solve this problem, a threshold attention network (TANet) is proposed. It learns separate weights for the input and output of the attention module, so that training the network yields the weights most suitable for the current task. In addition, a threshold mechanism is designed to assign different weight thresholds to different levels of the network. By fully combining the semantic information from every level of the network, more accurate segmentation results are obtained. Both DCFNet and TANet are suitable for multi-level semantic information fusion, which makes the network output rich in semantic information. Both networks achieve high segmentation accuracy on the Cityscapes dataset and the more complex ADE20K dataset. Compared with DCFNet, TANet is much more flexible in the setting of module parameters, and the modules in TANet can be further applied in DCFNet.
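The parallel dilated convolutions mentioned in the abstract can be illustrated with a small sketch. The following is a simplified 1-D, pure-Python toy and not the thesis's implementation: the function names (`dilated_conv1d`, `effective_kernel_size`, `parallel_dilated_block`), the rates (1, 2, 4), and the choice to merge branches by summation are all assumptions made for illustration. It shows how a larger dilation rate widens the receptive field while skipping nearby samples, which is the loss of local information the abstract refers to.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation form).

    Assumes an odd kernel length so the kernel has a centre tap.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - half) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: dilation * (kernel_size - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def parallel_dilated_block(signal, kernel, rates):
    """Run one branch per dilation rate on the same input and merge them.

    Merging by element-wise sum is an illustrative assumption only.
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) for values in zip(*branches)]


# A unit impulse at index 6 makes each branch's sampling pattern visible.
impulse = [0.0] * 13
impulse[6] = 1.0
kernel = [1.0, 1.0, 1.0]

rate4_only = dilated_conv1d(impulse, kernel, 4)
merged = parallel_dilated_block(impulse, kernel, (1, 2, 4))
```

With rate 4 alone, the 3-tap kernel spans 9 samples but responds only at indices 2, 6, and 10, so the immediate neighbours of the impulse are never touched. Adding the rate-1 and rate-2 branches fills in indices 4, 5, 7, and 8, which is the general motivation for combining several rates.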
Keywords/Search Tags: Semantic segmentation, Convolutional neural network, Dilated convolution, Attention mechanism, Encoder-decoder framework