Medical image segmentation methods based on multi-modal data can improve segmentation accuracy by fusing multi-modal features and have been widely used in the field of smart medical service.However,under real medical conditions,part of the modal data is often missing due to many reasons such as insufficient medical equipment,high cost and scan time of obtaining various modal data,and the patient’s physical condition,etc.In the absence of partial modal data or with single modal data,the segmentation accuracy of the existing algorithm is significantly reduced.Aiming at the missing modalities problem,we propose a medical image segmentation method based on generalized knowledge distillation,which uses the idea of knowledge distillation to transfer the knowledge of a welltrained multi-modal segmentation network to a mono-modal one,thereby improving segmentation results of the mono-modality segmentation network.The framework proposed in this paper consists of two branches:teacher network and student network,both of which are based on the nnUNet,but the input of the teacher is multi-modal data while mono-modal data for the student.In the process of training the student network,in addition to using the reference segmentation label,this paper uses the softened output of the teacher as a new label to guide the student.Based on the hypothesis that the deep network layer contains high-order semantic information,we also restrict the similarity of the bottleneck features of the student and teacher network and encourage student to learn high-order semantic representation from the teacher.In addition,we also propose to use channel and spatial attention modules to allow student to focus on learning teacher and improve the efficiency of multi-modal to mono-modal knowledge distillation.This paper conducts comparative experiments on the multi-modal medical image data set.The segmentation results of the student network after knowledge distillation are better than the baseline model and the current best models U-HVED and U-HeMIS,which proves the effectiveness of the proposed method.At the same time,more experiments show that this method works generally with different data sets,different network structures,and different number and types of modalities. |