| With the rapid development of the artificial intelligence industry, autonomous driving technology is moving ever closer to everyday life. Autonomous driving based on traditional algorithms first collects data about the surrounding environment through various sensors, then analyzes the data with traditional algorithms, and finally makes decisions to control the vehicle. Such multi-stage pipelines suffer from low efficiency, cannot run end to end, and have low accuracy. Deep learning-based autonomous driving instead collects surrounding environment data through a camera and uses deep learning algorithms to perform feature extraction, image segmentation, and vehicle decision-making end to end, which increases processing speed and greatly improves accuracy. Compared with expensive lidar sensors, inexpensive cameras greatly reduce cost and further promote the deployment of autonomous driving technology. To ensure vehicle safety, autonomous driving places high accuracy requirements on perception of the surrounding environment. Image semantic segmentation plays an important role here: its results make the estimate of the vehicle's drivable area more accurate and improve the judgment of object category and shape. At present, the main scenario for autonomous driving is the urban scene, so semantic segmentation of urban scenes is an important research area. In the past few years of rapid deep learning development, many end-to-end image semantic segmentation networks have emerged, such as FCN and DeepLab. This paper uses the Cityscapes dataset and the PyTorch framework to study scene semantic segmentation from the following three aspects. 1. Scene semantic segmentation combined with a dual attention mechanism. Attention information is extracted from the channel and spatial dimensions of the feature map. The channel
attention mechanism obtains the weights of the different channels in the feature map, and the spatial attention mechanism obtains the weights of the different positions in the feature map. In this paper, the dual attention module is embedded in the backbone network both in series and in parallel to improve the target segmentation accuracy of the model. 2. Scene semantic segmentation incorporating a multi-scale adaptive attention mechanism. This paper proposes integrating a multi-scale adaptive module with the attention mechanism. The multi-scale adaptive module adjusts the size of the receptive field according to the input, which avoids the loss of small-target information that occurs with fixed-size dilated (atrous) convolution. Fused with the attention mechanism, the module assigns different weights to targets of different sizes to recalibrate the feature maps and improve the semantic segmentation performance on small targets. 3. Multi-task scene semantic segmentation combining the attention mechanism and edge detection. This paper uses multi-task learning to share features and prevent overfitting. It proposes a multi-task dual-stream network structure that combines an edge detection sub-network with a semantic segmentation network equipped with the dual attention mechanism. The edge detection sub-network obtains image edges from the input image and extracts the edge information of small target objects. The dual-stream structure not only improves accuracy on small targets but is also effective for large targets, improving the overall segmentation quality of the semantic segmentation network. |
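
The dual attention mechanism of the first aspect can be illustrated with a minimal PyTorch sketch. The abstract does not specify the module design, so everything concrete here is an assumption: the class names `ChannelAttention`, `SpatialAttention`, and `DualAttention` are hypothetical, the channel branch follows a squeeze-and-excitation-style layout, the spatial branch a CBAM-style layout, and the parallel variant simply sums the two reweighted feature maps. The modules actually used in the paper may differ.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """One weight per channel, computed from globally pooled features.

    Assumption: squeeze-and-excitation-style design with reduction ratio 4.
    """

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.pool(x))  # weights of shape (N, C, 1, 1)


class SpatialAttention(nn.Module):
    """One weight per spatial position, computed from channel-wise statistics.

    Assumption: CBAM-style design with a 7x7 convolution.
    """

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)
        mx = x.max(dim=1, keepdim=True).values
        stats = torch.cat([avg, mx], dim=1)
        return torch.sigmoid(self.conv(stats))  # weights of shape (N, 1, H, W)


class DualAttention(nn.Module):
    """Combines both branches in series or in parallel (hypothetical wrapper)."""

    def __init__(self, channels: int, mode: str = "series"):
        super().__init__()
        if mode not in ("series", "parallel"):
            raise ValueError("mode must be 'series' or 'parallel'")
        self.mode = mode
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.mode == "series":
            # Reweight channels first, then positions of the result.
            x = x * self.channel(x)
            return x * self.spatial(x)
        # Parallel: both branches see the same input; outputs are summed
        # (the fusion rule is an assumption, not taken from the paper).
        return x * self.channel(x) + x * self.spatial(x)


# Usage: the output keeps the shape of the input feature map.
feats = torch.randn(2, 16, 32, 64)
out = DualAttention(16, mode="parallel")(feats)
```

Because both variants preserve the feature-map shape, a module of this kind could in principle be placed after any backbone stage without changing the layers that follow.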