Colorectal cancer is a common malignant tumor of the digestive tract. Its incidence and mortality rank third and second among all cancers, respectively, and it has become a major public health problem. The Colorectal Cancer Histopathological Image (CCHI) is an important basis for the clinical diagnosis of colorectal cancer. However, traditional pathological diagnosis relies heavily on the personal experience of individual doctors and therefore lacks stability and consistency. In recent years, deep learning has achieved many important results in medical image analysis. Using deep neural networks to automatically identify the different tissue types in CCHI makes it possible to quantify the tumor microenvironment, which is of great significance for the clinical diagnosis of colorectal cancer. However, two factors limit the application of deep learning to CCHI analysis: training deep neural networks requires a large number of annotated samples, and annotating CCHI samples demands a great deal of time from professional pathologists. Applying active learning to CCHI classification reduces the number of annotated samples that deep neural networks need and can alleviate these problems to some extent, which makes it a worthwhile research direction.

Given the complexity of colorectal cancer tissue and its pathological images, applying active learning to CCHI classification still faces the following challenges: (1) The heterogeneity of colorectal cancer tissue makes the pathological images diverse and complex, which makes it difficult to select training samples in active learning; (2) Because of variations in processes such as staining and sampling, pathological images of the same type of colorectal cancer tissue may differ in color, brightness, and contrast, which distorts the measurement of sample information in active learning; (3) There are a large number of unlabeled
samples in active-learning-based CCHI classification, and their value has not been fully explored or exploited.

To address these challenges, this paper proposes a CCHI classification method based on active learning that measures and selects high-value samples from the perspectives of diversity and information content. High-quality pseudo-labeled samples are obtained through model prediction and denoising learning, achieving accurate CCHI classification under annotation cost constraints. The main contributions of this paper are as follows: (1) To address the difficulty of selecting training samples in active learning caused by the diversity and complexity of CCHI, a diversity-based sample selection strategy is proposed that evaluates and screens high-quality samples by calculating the feature differences between samples. (2) To address the distorted measurement of sample information in active learning caused by differences in color, brightness, and contrast among CCHI of the same type, an information-based sample selection method is proposed that filters high-quality samples by optimizing the sample information metric. (3) To address the problem that the large number of unlabeled samples in active-learning-based CCHI classification are not put to good use, a semi-supervised learning mechanism is introduced into the existing active learning framework: high-quality pseudo-labeled samples are obtained through model prediction and denoising learning to enlarge the labeled sample set.

Experimental results on CCHI classification show that, with the same base network and the same number of labeled samples, the proposed active learning method achieves higher classification accuracy, stronger generalization ability, and greater practicality than existing active learning methods such as Core-set and VAAL.
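
As a rough illustration of two ideas summarized above (selecting samples by feature differences, and keeping only confident model predictions as pseudo-labels), a minimal Python sketch follows. It is not the method proposed in this paper: the diversity step uses plain greedy farthest-point selection, the general idea behind Core-set-style baselines, and the pseudo-labeling step uses a simple confidence threshold with no denoising learning. The function names `select_diverse` and `pseudo_label`, the Euclidean distance, and the 0.95 threshold are all assumptions made for the example.

```python
import math


def select_diverse(features, labeled_idx, budget):
    """Greedy farthest-point selection (illustrative, not the paper's strategy).

    Repeatedly picks the unlabeled sample whose nearest already-selected
    sample is farthest away in feature space. Assumes labeled_idx is non-empty.
    """
    labeled = set(labeled_idx)
    candidates = [i for i in range(len(features)) if i not in labeled]
    # Distance from each candidate to its nearest labeled sample.
    min_dist = {
        i: min(math.dist(features[i], features[j]) for j in labeled)
        for i in candidates
    }
    picked = []
    for _ in range(min(budget, len(candidates))):
        best = max(min_dist, key=min_dist.get)
        picked.append(best)
        del min_dist[best]
        # The new pick may now be the nearest selected sample for the rest.
        for i in min_dist:
            min_dist[i] = min(min_dist[i], math.dist(features[i], features[best]))
    return picked


def pseudo_label(probabilities, threshold=0.95):
    """Return (sample index, predicted class) pairs for confident predictions.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    kept = []
    for idx, probs in enumerate(probabilities):
        top = max(range(len(probs)), key=probs.__getitem__)
        if probs[top] >= threshold:
            kept.append((idx, top))
    return kept


if __name__ == "__main__":
    # Toy 2-D "features"; real features would come from a network's embedding.
    feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (10.0, 0.0), (0.0, 0.2)]
    print(select_diverse(feats, labeled_idx=[0], budget=2))
    print(pseudo_label([[0.97, 0.03], [0.60, 0.40], [0.01, 0.99]]))
```

In a real pipeline the feature vectors and class probabilities would come from the classification network, and the selected indices would be sent to a pathologist for annotation while the pseudo-labeled samples join the training set.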