
Encoder-Decoder Image Semantic Segmentation Based on Multi-Scale and Attention Mechanisms

Posted on: 2022-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: Q Guo
Full Text: PDF
GTID: 2518306491453314
Subject: Computer Science and Technology
Abstract/Summary:
Image semantic segmentation performs pixel-level classification: it assigns a category label to every pixel in an image. A semantic segmentation algorithm divides the image into regions, each corresponding to one type of object, and labels each region with its category, which lays the foundation for further image understanding. In recent years, semantic segmentation methods based on deep learning have achieved great success. This thesis first studies representative semantic segmentation models and then improves the UNet network model on that basis. To strengthen the feature representation ability of Convolutional Neural Networks (CNNs), channel attention and spatial attention mechanisms are embedded in the UNet network to improve the mean intersection over union (mIoU) of the segmentation. To enlarge the receptive field and aggregate multi-scale context information, the ordinary pooling layers in the UNet encoder are replaced with atrous (hole) convolution pooling layers, and an atrous convolution pyramid layer is added after the decoder to improve the segmentation result. To capture global context information, a Conditional Random Field (CRF) model is added to the SegNet network and trained end to end, which yields clearer segmentation boundaries.

(1) An image semantic segmentation model based on CBAMUNet is proposed. Embedding the CBAM attention module in the skip connections of the encoder-decoder UNet network addresses UNet's weak feature extraction and poor segmentation results. The pre-trained classification model VGG16, with its fully connected layers removed, serves as the encoder and extracts features through down-sampling. CBAM recalibrates the feature maps output by the encoder by explicitly modeling the interdependence of channels and spatial positions, so that image features are extracted across both channels and space. The decoder upsamples the feature maps and fuses them with the CBAM-adjusted, higher-quality feature maps. Comparison experiments with UNet show that CBAMUNet significantly improves segmentation, raising both pixel accuracy (PA) and mean intersection over union (mIoU).

(2) Small objects in an image are easily lost during segmentation. To address this, the pooling layers of the UNet encoder are replaced with atrous convolution pooling layers whose dilation rates increase progressively, which reduces the loss of features caused by standard pooling operations. Experiments show that the model improves the mIoU of small targets on the CamVid dataset and improves the overall segmentation result.

(3) Building on the experimental results of part (2), the image semantic segmentation model AsppUNet, based on an atrous convolution pyramid, is proposed. Atrous convolution layers with different dilation rates are cascaded to form the atrous convolution pyramid module, which is inserted after the decoder of the encoder-decoder network to aggregate multi-scale feature maps. Experiments show that AsppUNet segments targets of different sizes better and effectively improves segmentation accuracy.

(4) To address unclear target boundaries and ambiguous semantics, an end-to-end network model, SegNet With CRFs, is proposed. A Conditional Random Field (CRF), a probabilistic graphical model with Gaussian pairwise potentials and mean-field approximate inference, is used as the last layer of the SegNet network. The resulting model combines the properties of Deep Convolutional Neural Networks (DCNNs) and CRFs and takes feature and appearance consistency into account. The whole network can be trained end to end as a single deep neural network, which avoids separate post-processing of the image. Experiments show that the SegNet With CRFs model produces clearer target boundaries and more accurate semantic information.

Overall, the experiments show that the models proposed in this thesis effectively compensate for the shortcomings of encoder-decoder networks and improve their segmentation results, which benefits image understanding tasks and the application of image semantic segmentation in autonomous driving and high-precision maps.
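The abstract reports results in terms of pixel accuracy (PA) and mean intersection over union (mIoU) but does not spell out how they are computed. The following is a minimal sketch of the standard definitions, computed from a confusion matrix; it is not the thesis author's evaluation code. The function names and the toy label maps are hypothetical, and the choice to skip classes that appear in neither the ground truth nor the prediction is an assumption.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, predicted) pair as a single index, then histogram it.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """PA: correctly classified pixels divided by all pixels."""
    return np.trace(cm) / cm.sum()


def mean_iou(cm):
    """mIoU: per-class intersection over union, averaged over classes."""
    tp = np.diag(cm)
    # Union = predicted-as-class + labeled-as-class - intersection.
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    # Assumption: ignore classes absent from both label map and prediction.
    present = union > 0
    return (tp[present] / union[present]).mean()


if __name__ == "__main__":
    # Toy 2x3 label maps with three classes (illustrative data only).
    truth = [[0, 0, 1], [1, 2, 2]]
    pred = [[0, 1, 1], [1, 2, 0]]
    cm = confusion_matrix(truth, pred, num_classes=3)
    print("PA:", pixel_accuracy(cm))
    print("mIoU:", mean_iou(cm))
```

Because mIoU averages over classes rather than pixels, a small object class that is segmented badly lowers mIoU much more than it lowers PA, which is why improvements on small targets are reported in mIoU.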
Keywords/Search Tags:Image semantic segmentation, Hole convolution pooling, Hole convolution pyramid, SegNet, CBAM, UNet, CRFs