| Malignant tumors (cancer) are a major class of disease that can occur in any part of the body and seriously endanger human life and health. Histopathological diagnosis is a key factor affecting cancer treatment and prognosis. As whole slide image (WSI) scanning technology has matured, pathological diagnosis has gradually shifted from examination under the microscope to manual reading of WSIs. Because pathological images are complex and pathologists' diagnoses are subjective, manual diagnosis of WSIs is time-consuming and error-prone. Automatic assisted diagnosis of WSIs based on artificial intelligence is therefore of great practical significance for reducing pathologists' workload and improving the efficiency and consistency of diagnosis. In recent years, continuous breakthroughs in deep learning have greatly promoted the development of computational pathology. However, because annotating pathological data is highly specialized and time-consuming, large-scale annotated data are lacking, which restricts the further application of deep learning in computational pathology. To address this, this paper first takes the WSI binary classification algorithm as an entry point and introduces a weak annotation learning strategy to reduce the algorithm's dependence on large-scale labeled data. Then, from the perspective of self-supervised learning, it further explores a more complex WSI multi-classification algorithm for scenarios in which labeled data are scarce. This paper takes common cervical epithelial lesions as the application scenario, and its main contents are as follows: For the WSI binary classification task, this paper proposes a two-stage algorithm based on weak annotation that integrates traditional machine learning and deep learning methods, and analyzes the practicability of the algorithm. The algorithm includes two stages: lesion area detection and WSI classification. In the lesion 
area detection stage, we first divide the WSI into tissue patches, then train a deep learning classification network on the patches, and finally generate a heatmap representing the lesion area detection results by merging the predictions for these patches. In the WSI classification stage, we first extract morphological features from the heatmap and then use traditional machine learning methods to classify the WSI. This paper also proposes 1) a weak annotation strategy to reduce the dependence on large-scale data annotation; 2) an overlapping patch sampling strategy based on mask images to improve the accuracy of lesion area detection; and 3) a fully convolutional classification network to improve prediction speed. Our algorithm achieves good classification results, but its practicability is limited: 1) the weak annotation strategy reduces the model's dependence on large-scale labeled data to a certain extent, but the performance of this supervised algorithm is still limited by the scale and quality of the labeled data, so the algorithm is not well suited to lesion area detection in scenarios where labeled data are scarce; 2) the WSI classification algorithm based on traditional machine learning relies on complex and tedious feature engineering, which makes the method hard to extend. To better cope with the lack of labeled data and avoid tedious heatmap feature engineering, this paper proposes a two-stage classification algorithm based on self-supervised learning for the WSI multi-classification task. To take full advantage of the information contained in a large amount of unlabeled data, in the first stage we build a pre-trained Vision Transformer (ViT) model based on self-supervised learning for pathological slide analysis. First, we sample a large number of patches from a large number of unlabeled WSIs. We then use these unlabeled patches to pre-train the ViT model with a generative self-supervised 
learning architecture based on masked autoencoders. Finally, the pre-trained model is fine-tuned with a small number of labeled patches on the downstream task to classify patches. In the second stage, we propose a deep learning method for the WSI classification task. We first generate a heatmap of the lesion area using the mask-image-based overlapping patch sampling strategy, then compress the heatmap in the spatial dimension and fuse its information in the channel dimension, and finally classify the WSI with SCANet, a custom attention-based deep learning network that fuses spatial and channel attention. Our SCANet model avoids the tedious feature extraction process and effectively improves classification performance. Our proposed two-stage algorithm based on self-supervised deep learning achieves 87.14% accuracy on the cervical epithelial lesion patch classification task, and 86.21% accuracy and 95.56% on a custom performance metric on the WSI multi-classification task. This demonstrates the great application potential of self-supervised learning methods in scenarios where labeled data are scarce, as well as the superiority of WSI multi-classification algorithms that use deep learning in both stages. |
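
The heatmap generation step described above (overlapping patch sampling guided by a mask image, followed by merging the patch predictions) can be illustrated with a minimal sketch. It assumes a binary tissue mask at heatmap resolution and averages the predicted probabilities of all patches that cover a pixel. The function names, the `min_tissue` threshold, the averaging rule, and the `predict` callable standing in for the patch classifier are hypothetical choices made for illustration, not the implementation used in this work.

```python
from typing import Callable, Iterator, List, Tuple

Mask = List[List[int]]       # 1 = tissue, 0 = background (assumed convention)
Heatmap = List[List[float]]  # per-pixel lesion probability


def overlapping_patches(
    mask: Mask, patch: int, stride: int, min_tissue: float = 0.5
) -> Iterator[Tuple[int, int]]:
    """Yield top-left corners of patches that contain enough tissue.

    Choosing stride < patch makes neighbouring patches overlap.
    """
    rows, cols = len(mask), len(mask[0])
    for r in range(0, rows - patch + 1, stride):
        for c in range(0, cols - patch + 1, stride):
            tissue = sum(
                mask[i][j]
                for i in range(r, r + patch)
                for j in range(c, c + patch)
            )
            if tissue / (patch * patch) >= min_tissue:
                yield r, c


def build_heatmap(
    mask: Mask,
    predict: Callable[[int, int], float],
    patch: int,
    stride: int,
    min_tissue: float = 0.5,
) -> Heatmap:
    """Merge patch-level probabilities into a heatmap by averaging overlaps.

    `predict(r, c)` is a hypothetical stand-in for the patch classifier; it
    returns the lesion probability of the patch whose corner is (r, c).
    """
    rows, cols = len(mask), len(mask[0])
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    for r, c in overlapping_patches(mask, patch, stride, min_tissue):
        p = predict(r, c)
        for i in range(r, r + patch):
            for j in range(c, c + patch):
                total[i][j] += p
                count[i][j] += 1
    # Pixels not covered by any sampled patch keep probability 0.0.
    return [
        [total[i][j] / count[i][j] if count[i][j] else 0.0 for j in range(cols)]
        for i in range(rows)
    ]


if __name__ == "__main__":
    demo_mask = [[1] * 4 for _ in range(4)]
    # Toy classifier: patches in the right half are treated as lesions.
    demo = build_heatmap(demo_mask, lambda r, c: 1.0 if c >= 2 else 0.0, 2, 1)
    for row in demo:
        print(row)
```

Averaging over overlapping patches smooths the boundary between lesion and normal regions, which is one plausible reason overlapping sampling can improve lesion area detection; other merge rules (for example taking the maximum) would fit the same interface.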