Image denoising and restoration is a fundamental part of image processing and an active research topic, and it is of particular practical importance in medical imaging. Because of the complexity of human tissue and the imperfections of medical imaging equipment, the acquired medical images often have a low signal-to-noise ratio, which affects the diagnosis of disease. Computed Tomography (CT) is of great significance in modern medical diagnosis, and high-dose scans produce clearer CT images. However, higher-dose scans expose patients to more radiation, which can cause harm and may even induce cancer. Reducing the radiation dose, in turn, inevitably introduces noise, and this noise can interfere with doctors' diagnoses. Medical image denoising has therefore attracted widespread attention from researchers and has gradually become a research focus. Convolutional neural networks are widely used in medical image denoising models. They can effectively process data with local spatial correlations and use convolution operations to extract local features from images. However, they are limited in capturing long-distance dependencies and cannot effectively account for the overall relationships within an image. This paper applies the Transformer framework to the denoising of low-dose CT (LDCT) medical images, using the low-dose image information for feature extraction and noise distribution learning in order to restore normal-dose CT (NDCT) images. The main work is as follows:

(1) This paper introduces the Transformer framework into medical image denoising to build an Encoder-Decoder model with a residual structure. The network is a U-shaped hierarchical structure composed of an encoder and a decoder. The Encoder and Decoder modules are composed of multiple Le Swin Transformer Blocks, cross-model fusion modules, and downsampling layers. The Le Swin Transformer Block uses the attention mechanism to capture long-distance dependencies while also effectively capturing local contextual content. At the same time, the shifted-window mechanism reduces the computational burden and strengthens information transfer between windows. This model achieved excellent noise reduction performance and the best visual quality in the experiments.

(2) To address the Transformer's excessive focus on global information and its lack of attention to local features and image texture details during restoration, this paper proposes a cross-model fusion mechanism and, based on the structural characteristics of the U-shaped Transformer, a U-Net module that provides auxiliary supervised training. By combining the feature parameters of the convolutional neural network with the Transformer, the convolutional neural network provides low-level visual features that compensate for texture details, and the U-Net also supplies pre-trained parameters in an attempt to speed up the Transformer's training. A supervised attention module is introduced between the two models to improve performance: using the real noise information that is useful for image restoration together with local supervised predictions, the auxiliary module generates an attention map that suppresses features with low information content at the current stage, so that only useful features propagate to the main model. In addition, a new feature extraction module is introduced: a learnable Sobel convolution edge enhancement module, which differs from the traditional Sobel operator in that it contains a learnable parameter that contributes to edge enhancement. This parameter is adaptively optimized during training so that edge information of different intensities can be extracted.

Extensive validation experiments were carried out on the public simulated AAPM Mayo dataset and the real Piglet dataset. Ablation experiments that changed the structure of the model were conducted to explore the role of each module and to verify the performance of the model. The experimental results show that, compared with traditional classical models, the proposed model further improves low-dose CT image quality, raises PSNR and SSIM, and delivers visually better noise reduction.
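
To make the idea of the learnable Sobel convolution edge enhancement module more concrete, the following is a minimal sketch in plain Python. It assumes that the learnable parameter is a single scalar, here called `alpha` (a hypothetical name), that scales the weights of the standard 3×3 Sobel kernels; the abstract does not state how the parameter enters the kernel, so this form is an assumption. The sketch shows only the forward computation. In training, `alpha` would be updated by gradient descent along with the other network weights.

```python
import math


def sobel_kernels(alpha):
    """Return horizontal and vertical 3x3 Sobel-style kernels scaled by alpha.

    alpha stands in for the learnable parameter (assumed to be a scalar).
    With alpha = 1.0 these are the traditional fixed Sobel kernels.
    """
    gx = [[-alpha, 0.0, alpha],
          [-2.0 * alpha, 0.0, 2.0 * alpha],
          [-alpha, 0.0, alpha]]
    gy = [[-alpha, -2.0 * alpha, -alpha],
          [0.0, 0.0, 0.0],
          [alpha, 2.0 * alpha, alpha]]
    return gx, gy


def correlate_valid(image, kernel):
    """2D cross-correlation without padding, on nested lists of floats."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    acc += image[i + u][j + v] * kernel[u][v]
            row.append(acc)
        out.append(row)
    return out


def edge_magnitude(image, alpha):
    """Gradient magnitude of the image under the alpha-scaled Sobel kernels."""
    gx, gy = sobel_kernels(alpha)
    resp_x = correlate_valid(image, gx)
    resp_y = correlate_valid(image, gy)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)]
            for rx, ry in zip(resp_x, resp_y)]


if __name__ == "__main__":
    # A 4x4 image with a vertical step edge between columns 1 and 2.
    step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
    print(edge_magnitude(step, 1.0))   # traditional Sobel response
    print(edge_magnitude(step, 0.5))   # weaker response for a smaller alpha
```

Under this assumption, changing `alpha` rescales the edge response, which is one simple way a trained parameter could emphasize edges of different intensities. A real module would apply such kernels as convolution layers on feature maps inside the network.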