Throughout the lifecycle of subway construction, operation, and maintenance, tunnel defects, especially cracks and water leakage, are common; they seriously affect normal subway operation and can even lead to serious safety accidents. The visual sensors used for defect detection in subway tunnels are mainly visible light sensors and infrared sensors. Visible light sensors produce high-resolution images rich in detail, while infrared sensors distinguish targets from the background based on differences in thermal radiation and are not affected by low-light environments. This paper focuses on the rapid and accurate detection of tunnel defects through the fusion of visible light and infrared images. The research addresses three problems posed by the complex subway tunnel environment: high-quality fusion based on the similarities and differences in defect image features, automatic extraction of high-precision pixel-level labels, and accurate semantic segmentation. On this basis, an efficient visual detection approach for tunnel defects is constructed. The main research contents and innovations of the paper are as follows.

(1) To address the susceptibility of a single visual sensor to environmental influences, a method is proposed to fuse visible light and infrared images based on feature similarity and difference. First, a multi-scale Retinex enhancement algorithm with color restoration (CWIE) is used for image enhancement, which mitigates uneven illumination and shadows in the defect areas of visible light images. Second, a fusion network called STD-GAN is constructed using ResNet blocks. The network adopts a generative adversarial network (GAN) framework composed of a generator and a discriminator. The generator receives the enhanced visible light and infrared images as inputs and uses the structural similarity (SSIM) function and the sum of the correlations of differences (SCD) function to preserve the details of the source images as much as
possible. The discriminator determines the style of the final fused image; through feature learning, the style of the fused image is biased towards the visible light image, which is more in line with human perception. Experimental results show that, compared with other fusion methods, STD-GAN effectively preserves the details of the source images when fusing images of cracks and water leakage. In addition, STD-GAN achieves the best values on five evaluation indicators (EN, QE, SF, VIFF, and MI), indicating that it can generate high-quality fused images.

(2) To address the large number of pixel-level labels required to train a semantic segmentation network, a weakly supervised automatic label extraction method called ILD is proposed. The method consists of two steps: image-level label extraction and CAM label generation. Image-level label extraction uses different feature extraction algorithms to extract the features of cracks and water leakage, preserving defect areas while removing some background information. The CAM labels are then generated through feature learning with the improved S-Res2Net, producing high-precision pixel-level labels. In experimental comparisons, ILD outperforms other weakly supervised methods, such as SC-CAM, on various metrics. Specifically, ILD improves Loss, Recall, Precision, MIoU, and F1 by 0.041, 12.7%, 11.3%, 7.8%, and 10.3%, respectively, while being more time-efficient.

(3) To address the incomplete segmentation of tunnel defects by semantic segmentation networks in complex environments, a fully supervised semantic segmentation network called S-Net is proposed. S-Net uses S-Res2Net101 as the backbone network and incorporates channel attention and spatial attention mechanisms. Experimental comparisons show that, compared with other fully supervised networks, including HRNet, PSPNet, U-Net, and DeepLab
V3+, S-Net segments water leakage and cracks more completely, demonstrating better generalization ability. In the performance evaluation, S-Net outperforms the second-best network by 0.131, 1.8%, 3.9%, 3.6%, and 3.3% in Loss, Precision, Recall, F1, and MIoU, respectively.

The multi-sensor image fusion visual detection approach for tunnel defects can quickly and accurately detect defect areas in complex tunnel environments. The approach combines the advantages of infrared and visible light sensors and generates fused images with more detailed information. This makes it easier to distinguish the tunnel from defect areas, improves the accuracy of defect detection, and reduces the missed detection rate. Automatic pixel-level label extraction can be performed on the fused images, which not only improves efficiency but also ensures precision.
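
The segmentation metrics reported above (Precision, Recall, F1, and MIoU) can be computed from a predicted mask and a ground-truth mask roughly as in the following minimal sketch. The function name `segmentation_metrics` and the toy masks are illustrative assumptions, not the paper's evaluation code. MIoU is taken here as the mean of the defect-class and background-class IoU, which is one common convention; the paper may average differently.

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Precision, recall, F1 and mean IoU for binary masks (1 = defect, 0 = background)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Pixel-level confusion counts.
    tp = np.sum(pred & target)      # defect predicted, defect present
    fp = np.sum(pred & ~target)     # defect predicted, background present
    fn = np.sum(~pred & target)     # background predicted, defect present
    tn = np.sum(~pred & ~target)    # background predicted, background present

    eps = 1e-12  # guards against division by zero on empty masks
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)

    # IoU per class, then averaged over the two classes (assumed convention).
    iou_defect = tp / (tp + fp + fn + eps)
    iou_background = tn / (tn + fp + fn + eps)
    miou = (iou_defect + iou_background) / 2

    return {"precision": float(precision), "recall": float(recall),
            "f1": float(f1), "miou": float(miou)}

# Toy 2x3 masks (hypothetical data): 2 true positives, 1 false positive,
# 1 false negative, 2 true negatives.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
metrics = segmentation_metrics(pred, target)
print(metrics)
```

A lower missed detection rate corresponds to higher Recall in this formulation, since missed defect pixels are counted as false negatives.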