| The prevalence of ophthalmic diseases such as glaucoma, macular degeneration, fundus exudates, and diabetes is increasing year by year, and early detection and treatment of these diseases can effectively protect patients’ quality of life. The development of some diseases in the human body can change the retinal blood vessels and produce lesion features in the fundus. Physicians also often use changes in retinal vascular characteristics as an essential indicator for diagnosing cardiovascular disease. Therefore, accurate segmentation of retinal fundus vessels in medical images is crucial for assisting physicians in diagnosis and treatment. In recent years, convolutional neural networks have been widely used in retinal vascular pixel classification and vessel segmentation tasks. Although research has made progress in improving both the efficiency of real-time retinal vessel segmentation and its accuracy, some problems remain: (1) Most existing methods extract shallow detail features incompletely or lose part of the feature information, making it difficult to accurately segment capillaries located at the edge of the image. In addition, the fundus vessel tree is generally asymmetric, and the diameters of arterioles and capillaries vary greatly, so some methods struggle to segment them simultaneously. (2) Although convolutional neural networks are good at extracting local feature information from images, the receptive field of a convolutional block is limited, and simply stacking multiple blocks can easily cause information loss, which limits feature extraction and vessel segmentation. (3) Although the Transformer performs well in modeling long-distance dependencies, local information is also important for retinal fundus vessel segmentation, so it is difficult to ensure accurate vessel segmentation by using a Transformer alone. This paper investigates the retinal image vessel
segmentation method based on a convolutional neural network combined with a Transformer to address the above problems. The main contributions are as follows: (1) A multi-scale retinal vessel segmentation network based on skip-connection information enhancement (SCFE_Net) is proposed. First, retinal images at multiple scales are used as input so that the network can capture features at different scales. Second, a feature aggregation module is proposed to aggregate information from the shallow layers, giving the network richer shallow features. Finally, a skip-connection information enhancement module is proposed to fuse the detailed features of the shallow layers with the high-level features of the deep layers, avoiding incomplete information interaction between network layers. (2) Building on SCFE_Net, we consider how to extract the relationships between local detail features while making complementary use of long-distance dependency information. The Transformer mechanism is introduced, and a network model that combines a convolutional neural network with a Transformer (Multi-scale Transformer-Position Attention Network, MTPA_Unet) is designed for the retinal vessel segmentation task. By combining the Transformer with the convolutional neural network in a serial manner, the proposed Transformer-Position Attention module captures long-range dependencies and focuses on the position information of vascular pixels, helping MTPA_Unet achieve a finer segmentation of capillaries. (3) The performance of the proposed network models is evaluated and analyzed on three publicly available retinal image datasets: DRIVE, CHASE, and STARE. After training and testing on the three datasets, SCFE_Net achieves 97.01%, 97.67%, and 97.68% in accuracy, 98.32%, 98.65%, and 98.53% in specificity, and 83.00%, 81.81%, and 85.05% in Dice, respectively. Compared with the second-best model, SCFE_Net improves the accuracy of retinal vessel
image segmentation. The accuracy of MTPA_Unet reaches 97.18%, 97.62%, and 97.73%; the specificity reaches 98.36%, 98.58%, and 98.41%; and the Dice reaches 83.18%, 81.64%, and 85.57% on DRIVE, CHASE, and STARE, respectively. MTPA_Unet can not only segment the complete retinal vessel tree but also further improve the segmentation accuracy of the fine endings of the marginal vessels. |
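
The three reported metrics can be illustrated with a minimal sketch. It uses the standard confusion-matrix definitions of accuracy, specificity, and Dice on binary masks (1 = vessel pixel, 0 = background). The abstract does not describe the evaluation protocol (for example thresholding or field-of-view masking), so this is an assumption rather than the authors' code, and the function names `confusion_counts` and `segmentation_metrics` are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D binary mask, 1 = vessel, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Count true/false positives and negatives over two equally sized masks."""
    tp = tn = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif not p and not t:
                tn += 1
            elif p and not t:
                fp += 1
            else:
                fn += 1
    return tp, tn, fp, fn


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Standard definitions (assumed; the abstract does not spell them out)."""
    tp, tn, fp, fn = confusion_counts(pred, truth)
    total = tp + tn + fp + fn
    dice_denominator = 2 * tp + fp + fn
    return {
        # Fraction of all pixels labeled correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of background pixels correctly labeled as background.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Overlap between predicted and true vessel pixels.
        "dice": 2 * tp / dice_denominator if dice_denominator else 0.0,
    }


if __name__ == "__main__":
    # Toy 2x4 masks, purely illustrative (not data from the paper).
    predicted = [[1, 1, 0, 0], [0, 1, 0, 0]]
    reference = [[1, 0, 0, 0], [0, 1, 1, 0]]
    print(segmentation_metrics(predicted, reference))
```

Because vessel pixels are a small minority of a fundus image, accuracy and specificity are dominated by the background class, which is consistent with the Dice scores above being much lower than the other two metrics.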