The mortality rate of esophageal cancer is extremely high, but the 5-year survival rate after surgery for early-stage cancer can exceed 90%, so early detection, early diagnosis, and early treatment are important means of reducing the incidence and mortality of esophageal cancer. However, there are currently few studies on intelligent recognition of esophageal precancerous lesions, and no corresponding dataset is available. To study intelligent recognition technology for esophageal precancerous lesions, the following research work is carried out:

(1) At present, there are few deep learning studies on esophageal precancerous lesion segmentation. The existing studies are still in their infancy, their accuracy is low, and the relevant data are mostly private and hard to obtain, which has slowed the application of deep learning in this field. To solve this problem, this project constructs a dataset of esophageal precancerous lesions covering four lesion types: inflammation, low-grade intraepithelial neoplasia, high-grade intraepithelial neoplasia, and early cancer. Through continuous data expansion, the dataset now contains 1690 finely labeled images, filling a gap in this field.

(2) The lesion regions of the multiple lesion types in the esophageal precancerous lesion dataset have subtle features and large individual differences, which makes high-precision segmentation difficult. The self-attention mechanism can extract long-range dependencies and yield discriminative features, but its computational overhead is large. Therefore, a Globally Correlated Block-Level Self-Attention (GC-BLSA) method for segmenting esophageal precancerous lesion regions is proposed. First, Block-Level Self-Attention (BLSA) applies the self-attention mechanism to multiple feature blocks to reduce the number of network parameters and the amount of computation. Second, the Block Correlation
Mechanism (BCM) models the relationship between each feature block and the entire feature map, which solves the problem that self-attention applied to a single feature block cannot capture global long-range dependencies. Finally, a relative position offset is introduced into the block-level self-attention module to compensate for the position information lost through blocking, which effectively improves segmentation accuracy. Experimental results show that the mIoU and F1-Score on the four-class esophageal cancer dataset reach 50.21% and 63.79%, which are 3.74% and 4.30% higher, respectively, than those of the traditional Non-local self-attention module, while the number of parameters decreases by 26.38% and the amount of computation decreases by 10.62%. The method also outperforms other mainstream esophageal cancer segmentation methods.

(3) In the dataset of esophageal precancerous lesions, the size of the lesion region varies greatly across lesion types. To address the inability of traditional convolutional neural networks (CNNs) to extract multi-scale receptive field information and the weak ability of the Transformer to extract local features, an Information Enhanced Multi-Scale Transformer (IE-MSFormer) network for esophageal precancerous lesion segmentation is proposed. It integrates a CNN and a Transformer, and its encoder is composed of a CNN part and a Transformer part. First, in the CNN part, the Multi-scale Receptive Field Transformer (MRFTR) module is designed to extract multi-scale receptive fields for lesions of different scales. This module is based on the Transformer structure. MRFTR modules are stacked to form the CNN part, and the output of the CNN part is used as the input of the Transformer part to extract discriminative features of the lesion regions at each scale. Second, a Multi-scale Enhanced Information Fusion (MEIF) module is designed to fuse the features of the adjacent three-scale
encoders and extract multi-scale detail information, compensating for the limited detail information available to the decoder. Finally, the Migration Pooled Downsampling (MPD) module is designed to reduce the loss of detail information in traditional downsampling and is embedded in the MEIF module. In addition, by varying the number of stacked module layers, three networks of different sizes are designed: IE-MSFormer-S, IE-MSFormer-M, and IE-MSFormer-L. Experimental results show that the mIoU of the large network on the esophageal precancerous lesion dataset reaches 57.42%, and the precision (Pr), specificity (Sp), and Dice Similarity Coefficient (DSC) reach 78.72%, 93.06%, and 70.50%, respectively. The mIoU increases by 4.46%, and the Pr, Sp, and DSC increase by 6.95%, 0.27%, and 4.43%, respectively, which is better than other mainstream segmentation networks.
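
The segmentation metrics reported above (mIoU, Pr, Sp, and DSC) can be illustrated with a minimal Python sketch. The abstract does not state how the metrics are averaged or whether a background class is included, so the code assumes per-class pixel counts followed by an unweighted mean over the listed classes, and it assumes that Pr denotes precision. The function names `confusion_counts`, `class_metrics`, and `mean_metrics` are hypothetical and are not taken from the authors' implementation.

```python
from typing import Dict, Sequence


def confusion_counts(pred: Sequence[int], truth: Sequence[int], cls: int) -> Dict[str, int]:
    """Count TP/FP/FN/TN for one class over flattened label maps."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == cls and t == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif t == cls:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def safe_div(num: float, den: float) -> float:
    """Return 0.0 instead of raising when a class never occurs (an assumption)."""
    return num / den if den else 0.0


def class_metrics(pred: Sequence[int], truth: Sequence[int], cls: int) -> Dict[str, float]:
    """Per-class IoU, precision, specificity, and Dice from pixel counts."""
    c = confusion_counts(pred, truth, cls)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    return {
        "iou": safe_div(tp, tp + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "specificity": safe_div(tn, tn + fp),
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
    }


def mean_metrics(pred: Sequence[int], truth: Sequence[int], classes: Sequence[int]) -> Dict[str, float]:
    """Unweighted mean over classes (assumed averaging; "iou" is then the mIoU)."""
    per_class = [class_metrics(pred, truth, c) for c in classes]
    return {k: sum(m[k] for m in per_class) / len(per_class) for k in per_class[0]}


if __name__ == "__main__":
    # Toy 2x4 label maps, flattened row by row; class ids are illustrative only.
    pred = [0, 1, 1, 2, 0, 1, 2, 2]
    truth = [0, 1, 2, 2, 0, 1, 1, 2]
    for name, value in mean_metrics(pred, truth, classes=[0, 1, 2]).items():
        print(f"{name}: {value:.4f}")
```

For a single class, the Dice coefficient 2TP / (2TP + FP + FN) coincides with the F1-Score, which is why the two metrics are often reported interchangeably in segmentation work.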