Research On Image Semantic Segmentation Based On Deep Network

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: X P Zhao
Full Text: PDF
GTID: 2428330614971506
Subject: Signal and Information Processing
Abstract/Summary:
Image semantic segmentation is an important branch of artificial intelligence that aims to assign a category label to every pixel in an image. It plays a vital role in many applications, such as autonomous driving, robot control, medical imaging and fashion recommendation. With the development of deep networks, and especially the emergence of fully convolutional neural networks, researchers have proposed a variety of end-to-end semantic segmentation methods. Although these methods achieve good results, they have certain limitations. Starting from research on attention modules and multi-scale feature fusion, this thesis proposes two segmentation models with different structures. The main research work is as follows:

(1) We study and compare several classic image semantic segmentation algorithms: FCN, DPN, DeepLab and PSPNet. The main feature of FCN is that it removes the fully connected layers of the convolutional neural network and uses skip structures at different depth levels to improve segmentation performance. DPN designs a high-order cyclic convolutional neural network to extract image features. DeepLab uses atrous convolution to expand the receptive field and strengthen the expression of image features. PSPNet incorporates a pyramid pooling module for scene parsing. Based on the experimental analysis and comparison of these classic algorithms, this thesis builds on the FCN and PSPNet networks for the task of image semantic segmentation.

(2) An image semantic segmentation model (U-SEM) based on an encoder-decoder structure is proposed. The model consists of an atrous spatial pyramid pooling (ASPP) module and a channel attention module based on depthwise separable convolution. The ASPP module obtains multi-scale information by using an improved 4-layer atrous convolution structure and a global average pooling layer. The channel attention module combines the channel attention mechanism with depthwise separable convolution and is used to calibrate the consistency of the feature information on different channels. At the same time, the low-level features of the encoding network are passed to the decoding network through skip structures, so that low-level image detail and high-level semantic information are efficiently integrated to improve segmentation performance. Experimental results on PASCAL VOC 2012 and Cityscapes show that the U-SEM model enhances image feature expression, refines the segmentation edges of objects and improves segmentation accuracy.

(3) An image semantic segmentation model (DA-Res2Net) based on multi-scale feature fusion is proposed. The model is mainly composed of three parts: a dense feature extraction network, an attention module and a pyramid pooling module. The dense feature extraction network expresses multi-scale features in a more fine-grained manner, expanding the receptive field of each network layer and improving the network's ability to extract image features. The attention module applies softmax normalization to obtain a probability distribution that represents the contribution of each channel. The pyramid pooling module fuses multi-scale features from different levels to improve the segmentation result. Experimental results on PASCAL VOC 2012 and Cityscapes show that the DA-Res2Net model strengthens the use of image context and the expression of extracted features, thereby improving the segmentation accuracy of small targets.
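
As an illustration of the softmax-normalized channel attention described for DA-Res2Net, the following is a minimal sketch in plain Python, not the thesis implementation. It assumes that each channel is summarized by global average pooling and that the softmax of those summaries gives the per-channel contribution weights; the abstract does not state how the channel scores are computed, and the function names (`global_average_pool`, `softmax`, `channel_attention`) are hypothetical.

```python
import math


def global_average_pool(feature_map):
    """Reduce a C x H x W feature map (nested lists) to one mean per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(feature_map):
    """Return (weights, reweighted feature map).

    Assumption: channel scores are the global-average-pooled activations.
    The weights form a probability distribution over channels.
    """
    weights = softmax(global_average_pool(feature_map))
    reweighted = [
        [[w * value for value in row] for row in channel]
        for w, channel in zip(weights, feature_map)
    ]
    return weights, reweighted


# Toy feature map with 3 channels of size 2 x 2 (made-up values).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
weights, reweighted = channel_attention(features)
```

Because the weights sum to 1, they can be read as the relative contribution of each channel; in this toy input the channel with the largest mean activation receives the largest weight.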
Keywords/Search Tags: Semantic segmentation, Encoder-decoder, Attention mechanism, Feature fusion, DA-Res2Net