| Liver cancer has a high mortality rate and a high incidence in China, which accounts for half of the world’s liver cancer patients. Prognostic analysis is performed before liver cancer surgery: a pathologist examines pathological images of the liver cancer tissue and, taking the patient’s condition into account, judges the prognosis in order to formulate a targeted treatment plan. Because pathological images of liver cancer are structurally complex and information-rich, and pathologists are scarce, medical resources are strained. Using machine learning to assist prognostic analysis can reduce the burden on pathologists and let them focus more on treating patients, thereby helping to improve the cure rate of liver cancer. In this study, image processing and computational methods are used to extract quantitative features from liver cancer pathology patches, and classification and survival risk prediction are based on these features. Finally, the survival risk is used to stage the cancer and to mine important features related to prognosis. Because an original whole-slide image (WSI) can contain tens of billions of pixels, each liver cancer pathology WSI is first tiled into patches of 256 × 256 pixels. After the patches are preprocessed, a purpose-built CellProfiler feature extraction pipeline computes the quantitative features of each patch. An XGBoost classification model then classifies each patch as cancer or non-cancer based on the extracted features. Based on the classification results, 36 representative cancer-region patches are sampled from each WSI and used to predict the survival risk of the liver cancer patient. Survival risk is predicted with a combined XGBoost and Cox model, and a patient’s survival risk is the mean of the risks predicted for the sampled patches. Finally, the median survival risk is used to divide the liver cancer patients into groups, and then the TNM
staging and survival curves are plotted for each group to verify the performance of the prognostic model. Both the classification model and the prognostic model take the quantitative features extracted by CellProfiler as input; the classification model is added to select the cancer-category patches used for feature extraction. To verify the effectiveness of the CellProfiler features, the LBP (local binary pattern) and GLCM (gray-level co-occurrence matrix) algorithms are also used to extract patch features, and these features are used to train the same classification model so that classification accuracy can be compared. In addition, the experiments explore the impact of different magnifications and different patch category ratios on the prognostic model. The liver cancer pathological images are available at two magnifications, x10 and x40. The category ratio refers to the proportion of cancer and non-cancer patches among the 36 patches cut from a single WSI. XGBoost is the best-performing classification model, with random forest and SVM used for comparison, and the prognostic model combines XGBoost with the Cox model. Combining the two makes full use of the large amount of censored data in the survival data, thereby improving the concordance index of the predicted survival risk. The training and test sets come from a hospital affiliated with Fudan University in Shanghai, and the validation set is taken from The Cancer Genome Atlas (TCGA) liver cancer data set. After extensive experiments, 728 quantitative features are extracted from each patch. The XGBoost classifier reaches an accuracy of up to 86.9% on the validation set. The combined XGBoost and Cox regression prognostic model reaches a concordance index of 0.67 for the predicted survival risk. The features for both the classification model and the prognostic model are extracted by CellProfiler, and the optimal model data
input is a x40 magnification liver cancer pathology patch. Experimental comparison shows that features extracted from x40 magnification images perform better than those extracted from x10 magnification images, and that cancer-region patches affect the prognosis result more than non-cancer-region patches. In this paper, the performance of the prognostic model is verified by TNM staging and Kaplan-Meier analysis. The experimental results show that machine-learning-based prognostic analysis of liver cancer is of medical significance. |
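
The patient-level steps described above (averaging patch risks, splitting patients at the median risk, and scoring the ranking with the concordance index) can be illustrated with a minimal sketch. It is not the study's implementation: the per-patch risks are assumed to have already been produced by the XGBoost and Cox model, and all function names, patient IDs, and numbers below are hypothetical toy values.

```python
from statistics import mean, median


def patient_risk(patch_risks):
    """Patient-level risk = mean of the per-patch predicted risks."""
    return mean(patch_risks)


def median_split(risks):
    """Label each patient 'high' or 'low' relative to the cohort median risk."""
    cutoff = median(risks.values())
    return {pid: ("high" if r > cutoff else "low") for pid, r in risks.items()}


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (flag 1). Censored patients (flag 0) only count
    as the longer-surviving member of a pair. Tied risks score 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: three patch risks per patient (the study samples 36).
patch_risks = {
    "p1": [0.9, 0.8, 1.0],
    "p2": [0.2, 0.3, 0.1],
    "p3": [0.6, 0.5, 0.7],
    "p4": [0.4, 0.3, 0.5],
}
risks = {pid: patient_risk(v) for pid, v in patch_risks.items()}
groups = median_split(risks)

ids = list(risks)
times = [5, 40, 12, 10]   # follow-up in months (made up)
events = [1, 0, 1, 1]     # 1 = death observed, 0 = censored
c_index = concordance_index(times, events, [risks[p] for p in ids])
```

In this toy cohort the censored patient `p2` never starts a comparable pair but still contributes as the longer-surviving member of several pairs, which is how censored records add information to the concordance index. The resulting `groups` labels are what a Kaplan-Meier comparison would then be stratified by.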