| In recent years, with the acceleration of urbanization in China, the scale of highway construction has continued to expand, and highway mileage has increased rapidly. Both existing and newly constructed highways require maintenance, and road cracks, as the most common type of pavement damage, are the basis on which the relevant departments plan road maintenance and make scientific decisions. Road crack images are characterized by fuzzy crack boundaries, low contrast with the surrounding environment, and complex topological structures, and most existing semantic segmentation methods for road crack images still have major shortcomings in obtaining receptive fields and extracting image feature information. This paper explores the semantic segmentation of road cracks in images using convolutional neural networks in deep learning. Inspired by the U-Net neural network model, this paper proposes two convolutional neural networks based on an improved U-Net, namely ASARU-Net and DSEAU-Net. To obtain a larger receptive field and improve the ability to capture image feature information, ASARU-Net is improved as follows: a nested neural network structure is proposed and applied in the downsampling process of the traditional U-Net, increasing the depth of the encoder to expand the receptive field and improve segmentation performance. At the same time, two residual convolution modules based on the Squeeze-and-Excitation (SE) module are proposed and used as attention mechanisms in the skip connections and the upsampling process of the traditional U-Net, avoiding the vanishing or exploding gradients caused by the increased depth of the neural network model. To address the increased number of learnable parameters and the decreased training efficiency caused by the new modules and connection layers introduced in ASARU-Net, this paper proposes another neural network model named DSEAU-Net. The training efficiency of the model is improved by adding a dense module to the last layer of the downsampling path and adding a ConvLSTM-based attention mechanism in the skip connections. Meanwhile, the method of increasing the depth of the encoder is still used to improve the accuracy of model training, and a new spatial compression activation (SE) module is proposed and applied to the last layer of each downsampling stage. In the upsampling process, in contrast to the traditional U-Net with only one upsampling path, a new encoder-decoder structure with two upsampling paths is proposed. Compared with the traditional U-Net model, the additional upsampling path can reduce the semantic gap between the downsampling and upsampling paths in the encoding-decoding process, thereby improving the accuracy of model training. Experiments were then conducted to verify the proposed neural network models, using the Dice coefficient, the Jaccard coefficient, and accuracy as evaluation metrics. Both quantitative and qualitative analyses were performed on the two models. Experimental results on the overall structure of the two models show that, compared with other classical neural network models, the two proposed models perform better in road crack image semantic segmentation. In addition, ablation experiments were conducted on the two models, which indicate that the proposed modules are essential for improving performance in road crack image semantic segmentation. Finally, to better suit road crack semantic segmentation, a hybrid loss function based on binary cross-entropy and the Jaccard coefficient was used. Experimental results show that, compared with other traditional loss functions, this loss function is more suitable for semantic segmentation of road crack images. |
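
The evaluation metrics and the hybrid loss named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the flattened-list mask representation, the smoothing term `eps`, and in particular the weighting factor `alpha` are assumptions made for illustration, since the abstract does not state how the binary cross-entropy and Jaccard terms are combined.

```python
import math
from typing import Sequence


def dice_coefficient(pred: Sequence[float], target: Sequence[float],
                     eps: float = 1e-7) -> float:
    """Dice = 2|A n B| / (|A| + |B|) over flattened masks (eps avoids 0/0)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def jaccard_coefficient(pred: Sequence[float], target: Sequence[float],
                        eps: float = 1e-7) -> float:
    """Jaccard (IoU) = |A n B| / |A u B| over flattened masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return (intersection + eps) / (union + eps)


def pixel_accuracy(pred: Sequence[int], target: Sequence[int]) -> float:
    """Fraction of pixels whose binary label matches the ground truth."""
    correct = sum(1 for p, t in zip(pred, target) if p == t)
    return correct / len(target)


def binary_cross_entropy(probs: Sequence[float], target: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(target)


def hybrid_loss(probs: Sequence[float], target: Sequence[float],
                alpha: float = 0.5) -> float:
    """Assumed form: alpha * BCE + (1 - alpha) * (1 - soft Jaccard).

    alpha is a hypothetical weighting, not a value taken from the paper.
    """
    soft_jaccard = jaccard_coefficient(probs, target)
    return (alpha * binary_cross_entropy(probs, target)
            + (1.0 - alpha) * (1.0 - soft_jaccard))


if __name__ == "__main__":
    predicted_mask = [1, 1, 0, 0]   # thresholded prediction, flattened
    ground_truth = [1, 0, 0, 0]     # crack pixels labeled 1
    print(dice_coefficient(predicted_mask, ground_truth))
    print(jaccard_coefficient(predicted_mask, ground_truth))
    print(pixel_accuracy(predicted_mask, ground_truth))
    print(hybrid_loss([0.9, 0.6, 0.1, 0.2], ground_truth))
```

Because crack pixels usually occupy a small fraction of a road image, an overlap term such as the Jaccard coefficient penalizes missed cracks more directly than pixel-wise cross-entropy alone, which is one plausible motivation for combining the two.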