| With the development of deep learning in the field of computer vision,the semantic segmentation algorithms based on deep convolutional neural network(CNN)have made great progress.Semantic segmentation has been widely used in medical image processing,geographic information system,autonomous driving,robot vision,etc.This technology has attracted more and more attention from scholars.The task of semantic segmentation is to detect and label the target object in the image,depict the boundary of each object,and finally get a segmentation image with pixel-level semantic annotation.The image semantic segmentation technology mainly includes the traditional feature extraction methods and the deep learning methods.The deep learning-based semantic segmentation methods mainly have two difficulties:(1)semantic segmentation not only needs to achieve pixel classification,but also needs to achieve the positioning of the foreground target.Due to the inherent translation invariance of the convolutional neural network and its insensitivity to space,it is difficult for the network to locate the foreground target.(2)continuous down-sampling operation in the convolutional neural network makes the resolution of the output feature map smaller and smaller,and the information is seriously lost,resulting in inaccurate pixel classification.In view of the above problems,this paper mainly does the following work:1.In order to make the location of foreground target more accurate,Convolutional Block Attention Module(CBAM)is introduced into the existing Deep Lab V3+ network,and a attention-deeplab segmentation network is proposed.CBAM module learns channel attention and spatial attention of feature map respectively.Channel attention can emphasize important details in each feature map,and spatial attention can better help network locate the location of the target.Attention-deeplab achieves finer segmentation effect by introducing the attention-mechanism module.2.In order to achieve more accurate pixel classification,this paper proposes a margin cosine loss function,which introduces cosine margin for each category.Existing semantic segmentation models usually use traditional softmax loss as a loss function to train the network,while softmax loss only focuses on the distances between classes and ignores theintra-variance.Compared with traditional softmax loss,the cosine loss function based on margin not only optimizes the distance between classes,but also optimizes the distance within classes,making the decision boundary between different classes more robust.3.This paper conducted sufficient experiments on some public benchmark data sets(PASCAL VOC 2012,Cityscapes,ADE20K)to analyze the effect of each module,and the experimental results demonstrate that the method proposed in this paper has achieved better performance. |