Image semantic segmentation enables a computer to partition an image into regions and recognize each one. In essence, it assigns a semantic label to every pixel through per-pixel classification. Advances in artificial intelligence have led to its wide application in areas such as computer-aided medical diagnosis, self-driving vehicles, and remote sensing image interpretation, so image semantic segmentation is of great value in both research and practice. Current semantic segmentation algorithms still have problems: small-scale targets are missed or misrecognized, and large-scale targets suffer from discontinuous interior segmentation and unclear boundaries. To address these problems, this paper develops semantic segmentation algorithms based on the encoder-decoder structure. The main research work is summarized as follows:

(1) To address the missed and false recognition of small-scale targets in current image semantic segmentation algorithms, we propose an image semantic segmentation algorithm with multi-scale feature fusion and enhancement, inspired by HRNet’s design idea of maintaining high-resolution representations. The algorithm builds a multi-scale feature fusion and enhancement module at the encoder end of DeepLabV3+ and extracts multi-scale features in parallel with two branches; at the decoder end, skip connections concatenate the feature maps extracted by the multi-scale feature fusion and enhancement module with those of the original network. The purpose is to make full use of detail information to optimize the model’s output, thereby improving the model’s ability to segment small-scale targets. On the public Cityscapes and PASCAL VOC 2012 datasets, the proposed algorithm clearly reduces the missed and false recognition of small-scale targets.

(2) To address the problems of discontinuous interior segmentation and unclear boundaries in large-scale objects, we propose an image semantic segmentation algorithm with a multi-attention mechanism. Building on DeepLabV3+, the algorithm constructs a multi-attention module, tailored to the characteristics of feature maps at different layers, to strengthen the representation of features at key locations. First, a location attention module is built at the output of the deep abstract features to strengthen the representation of detail information in the feature map; second, a channel attention module is built at the output of the shallow features to activate effective features and help shallow information be classified and expressed; finally, a global attention module is placed at the decoder end, using deep abstract features to guide feedforward propagation so that the network can recover more spatial detail during upsampling. On the public Cityscapes and PASCAL VOC 2012 datasets, the proposed method scores nearly 4% and 2% higher than DeepLabV3+, respectively, and segmentation quality is clearly improved.

(3) To solve the segmentation problems of large-scale and small-scale targets simultaneously, we propose an image semantic segmentation algorithm based on multi-scale features and a multi-attention mechanism. At the encoder end of DeepLabV3+, we build both the multi-scale fusion and enhancement module and the location attention module, which strengthen the model’s ability to extract multi-scale features and identify deep abstract features; at the decoder end, the global attention module is constructed, and the representation of key features is strengthened through weighted feature aggregation. Comparison with other classical semantic segmentation algorithms on the public Cityscapes and PASCAL VOC 2012 datasets shows that the fused algorithm is effective.

Finally, the algorithms in this paper can be used in intelligent applications such as autonomous driving and indoor and outdoor scene analysis. They enable computers to understand and analyze images automatically and to capture contour structure and region location, providing technical support for subsequent research tasks.
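
As a rough illustration of three ideas named in this abstract (per-pixel classification, skip-connection concatenation at the decoder, and channel attention on shallow features), the following NumPy sketch may help. It is not the implementation used in this work. The abstract does not specify the attention design, so a squeeze-and-excitation-style gate is substituted here as an assumption; the nearest-neighbour upsampling, the random weights, the tensor sizes, and all function names (`per_pixel_classify`, `upsample_nearest`, `channel_attention`, `decode`) are likewise hypothetical and chosen only for brevity.

```python
import numpy as np

def per_pixel_classify(logits):
    """Turn class scores of shape (C, H, W) into an (H, W) label map."""
    return np.argmax(logits, axis=0)

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling: (C, H, W) -> (C, H*factor, W*factor)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def channel_attention(x, w1, w2):
    """Assumed squeeze-and-excitation-style gate, not the paper's module.

    x: (C, H, W) features; w1: (C_hidden, C); w2: (C, C_hidden).
    """
    squeezed = x.mean(axis=(1, 2))                # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid weights in (0, 1)
    return x * gate[:, None, None]                # reweight each channel

def decode(deep, shallow, w1, w2, classifier):
    """Fuse deep and shallow features, then classify every pixel."""
    factor = shallow.shape[1] // deep.shape[1]
    up = upsample_nearest(deep, factor)             # match shallow resolution
    shallow = channel_attention(shallow, w1, w2)    # emphasise useful channels
    fused = np.concatenate([up, shallow], axis=0)   # skip-connection concat
    # 1x1 convolution expressed as a per-pixel linear map over channels.
    logits = np.einsum("kc,chw->khw", classifier, fused)
    return per_pixel_classify(logits)

# Toy inputs: 8 deep channels at 4x4, 4 shallow channels at 16x16, 3 classes.
rng = np.random.default_rng(0)
deep = rng.standard_normal((8, 4, 4))
shallow = rng.standard_normal((4, 16, 16))
w1 = rng.standard_normal((2, 4))
w2 = rng.standard_normal((4, 2))
classifier = rng.standard_normal((3, 12))

labels = decode(deep, shallow, w1, w2, classifier)
print(labels.shape)  # (16, 16)
```

The label map has the spatial size of the shallow features, which reflects why concatenating high-resolution features at the decoder can help retain detail for small-scale targets.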