Cancer has become a major cause of death both domestically and worldwide, and breast cancer is a major cause of death among women. Examining the tumor region of a pathological section under a microscope for symptom analysis, and calculating the positive index of tumor cells, allows a targeted treatment plan to be drawn up for the patient. Because professional pathologists are in short supply and their judgments are partly subjective, tissue regions are often identified inaccurately and estimates of the positive index can deviate widely. Deep-learning-based tissue segmentation of pathological slide images helps in two ways. First, it lets pathologists quickly locate different tissue regions, which facilitates symptom analysis and positive index estimation. Second, common computer-aided pathological diagnosis systems calculate the positive index through cell segmentation, and good tissue segmentation can effectively improve the accuracy of cell segmentation. However, deep learning methods for image segmentation require a large amount of labeled data. In the medical field, labeling is expensive and time-consuming because the samples are diverse and difficult to annotate. It is therefore of high practical significance to train the model with semi-supervised learning instead of fully supervised learning: with a small amount of labeled data and a large amount of unlabeled data, it can achieve performance no worse than training on fully annotated data. To address these problems, this paper uses an H&E-stained breast cancer dataset and semi-supervised learning to study the semantic segmentation of breast cancer pathological sections. The paper completes the following work: (1) To explore the segmentation performance of different segmentation networks and different proportions of labeled data for 
pathological images, this paper selects three networks, LinkNet, U-Net, and TransUNet, and runs experiments with each on the breast cancer dataset. The results show that TransUNet has the best segmentation performance when labeled data is scarce, that the three networks perform equally when data is sufficient, with Dice_avg reaching 0.72, and that all three networks improve as the amount of labeled data increases. (2) To address the difficulty of labeling pathological image data, this paper adapts several semi-supervised classification methods to the segmentation problem and comprehensively compares self-training, consistency regularization, and hybrid methods on breast tissue segmentation. The results show that the hybrid method outperforms the other two, with Dice_avg reaching 0.69 when 1/10 of the data is labeled. In addition, the post-processing method and loss function for pseudo-labels are studied within self-training, and three types of image noise are studied within consistency regularization. The results show that hard labels are better suited to pseudo-label post-processing, that the cross-entropy function performs better for segmentation, and that CutMix noise suits pathological images better than Gaussian noise or interpolation consistency noise. (3) This paper proposes a hybrid algorithm based on pseudo-labels and consistency regularization to improve semi-supervised segmentation performance at a given proportion of labeled data. The algorithm selects the pseudo-labels generated by self-training by ranking them by intersection over union (IoU), and introduces a minimum-entropy constraint into the consistency regularization algorithm. The results show that the screened pseudo-labels further improve segmentation performance: Dice_avg increases by 1.3% when 1/10 of the data is labeled, achieving 
the same performance as the state-of-the-art CPS algorithm and outperforming CPS at other proportions of labeled data. With 5/10 of the data labeled, Dice_avg reaches 0.721, which matches the performance obtained when all of the data is labeled and surpasses all other semi-supervised segmentation algorithms compared. The semi-supervised segmentation algorithm proposed in this paper performs well on breast tissue segmentation, alleviates the data annotation problem to a certain extent, and can be transferred to other segmentation tasks.
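
The quantities named above can be illustrated with a minimal sketch. This is not the paper's implementation, and it rests on several assumptions: Dice_avg is taken to be the Dice coefficient averaged over tissue classes; pseudo-labels are ranked by the mean IoU between two predictions of the same unlabeled image (for example, from two training checkpoints), since the abstract does not say which pair of masks the IoU compares; and the minimum-entropy constraint is represented by the average per-pixel Shannon entropy of the predicted class probabilities. All function names and the `keep_ratio` parameter are hypothetical.

```python
from math import log


def dice_avg(pred, target, num_classes):
    """Dice coefficient averaged over classes (assumed meaning of Dice_avg).

    pred and target are flat lists of integer class labels, one per pixel.
    Classes absent from both maps are skipped.
    """
    scores = []
    for c in range(num_classes):
        n_pred = sum(1 for x in pred if x == c)
        n_true = sum(1 for x in target if x == c)
        if n_pred + n_true == 0:
            continue
        inter = sum(1 for x, y in zip(pred, target) if x == c and y == c)
        scores.append(2 * inter / (n_pred + n_true))
    return sum(scores) / len(scores)


def mean_iou(mask_a, mask_b, num_classes):
    """Intersection over union between two label maps, averaged over classes."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for x, y in zip(mask_a, mask_b) if x == c and y == c)
        union = sum(1 for x, y in zip(mask_a, mask_b) if x == c or y == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


def select_pseudo_labels(preds_a, preds_b, num_classes, keep_ratio):
    """Return indices of the unlabeled images whose pseudo-labels are kept.

    Assumption: an image's score is the IoU between two predictions of it
    (preds_a[i] and preds_b[i]); the highest-scoring fraction is kept.
    """
    order = sorted(
        range(len(preds_a)),
        key=lambda i: mean_iou(preds_a[i], preds_b[i], num_classes),
        reverse=True,
    )
    keep = max(1, int(len(order) * keep_ratio))
    return order[:keep]


def mean_entropy(probs):
    """Average per-pixel Shannon entropy of predicted class probabilities.

    probs is a list of per-pixel probability vectors. Adding this term to the
    loss pushes predictions on unlabeled pixels toward confident outputs.
    """
    total = 0.0
    for p in probs:
        total -= sum(q * log(q) for q in p if q > 0)
    return total / len(probs)
```

In a training loop, `select_pseudo_labels` would be applied once per self-training round to decide which unlabeled images join the training set, while `mean_entropy` would be added, with a weight, to the consistency loss on unlabeled batches.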