| Semantic segmentation, an important method for scene understanding in computer vision, has proven highly effective in assisted driving, remote sensing image segmentation, and medical care. In pursuit of accuracy, deep learning based semantic segmentation networks often ignore the hardware resource limits under which a model must run, and more complex structures consume large amounts of computational resources, which makes image processing too slow to meet the requirements of actual production tasks. To address these problems, this paper conducts an in-depth study of deep learning based real-time image semantic segmentation and proposes the following two real-time image semantic segmentation models: (1) To address the problem that semantic segmentation models are complex, computationally intensive, and difficult to migrate to mobile devices, this paper proposes a real-time image semantic segmentation model based on hybrid attention. The model has an asymmetric encoder-decoder structure. The encoder combines depth-wise separable convolution and dilated convolution in an efficient residual unit that extracts feature maps at different network depths, focusing more on spatial location information at the shallow levels and enhancing the representation of semantic information at the deep levels. The decoder uses a hybrid attention feature fusion method: spatial attention enhances the spatial location information of shallow features, and channel attention enhances the expression of key information in deep feature maps, so that the spatial and contextual information of feature maps at different levels is fused effectively and less image information is lost during fusion. On the Cityscapes dataset, this model achieves 93.2% PA and 73.2% mIoU. With fewer than 1.62 M parameters, it reaches up to 38 FPS on a Tesla V100 for a 512×1024 resolution image. On the PASCAL VOC 2012 dataset, this model achieves 74.8% mIoU. The experimental results show that this model can complete the urban scene image segmentation task effectively and in real time. (2) To address the high computational cost of the two branches in dual-branch structures, this paper proposes a dual-branch real-time semantic segmentation model with dense connections. The model runs a spatial branch and a semantic branch in parallel. In the spatial branch, the input image is down-sampled and spatial information is extracted using a combination of low-cost convolutions. In the semantic branch, densely connected blocks are designed by combining dense connections with non-bottleneck residual structures, which capture multi-scale contextual information while reducing the computational cost of feature extraction. A feature refinement module is also constructed in the semantic branch to model the pixel relationships in the feature map and enhance its representation of semantic information. A channel attention-guided feature fusion module is designed to fuse the spatial and contextual information of the two branches and reduce the information loss during feature fusion. Extensive experiments on the Cityscapes dataset show that this model achieves an excellent trade-off between segmentation accuracy and inference speed: for a 512×1024 resolution image, the model achieves 73.5% mIoU on the Cityscapes test set at a speed of 43 FPS on a Tesla V100. On the PASCAL VOC 2012 dataset, this model achieves 75.2% mIoU. These experiments demonstrate that the model achieves a new balance between segmentation accuracy and segmentation speed. |
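
The abstract reports pixel accuracy (PA) and mean intersection over union (mIoU) without defining them. The following is a minimal sketch of how these two metrics are commonly computed from a confusion matrix, assuming the standard definitions; it is not the authors' evaluation code. The function names and the toy labels are hypothetical, and skipping classes that appear in neither the ground truth nor the prediction is an assumption of this sketch (ignore labels are not handled).

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels per (ground-truth class, predicted class) pair.

    Rows index the ground-truth class, columns the predicted class.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """PA: correctly labelled pixels divided by all pixels."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """mIoU: per-class TP / (TP + FP + FN), averaged over classes.

    Assumption: classes with an empty union are skipped rather than
    counted as zero.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c pixels predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical toy example: six pixels, three classes, flattened label maps.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

m = confusion_matrix(truth, pred, num_classes=3)
pa = pixel_accuracy(m)
miou = mean_iou(m)
print(f"PA = {pa:.3f}, mIoU = {miou:.3f}")
```

In practice the label maps would be the flattened per-pixel class indices of a whole validation set, with the confusion matrix accumulated over all images before the metrics are taken.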