Background: Colorectal polyps are raised lesions that arise from the epithelial layer of the colorectal mucosa and protrude from the mucosal surface into the intestinal lumen. They are a common disease of the gastrointestinal tract and are closely related to the occurrence of colorectal cancer. Because colorectal cancer develops from colorectal polyps over a long period, early population screening, early diagnosis and early treatment are especially important for reducing the incidence of colorectal cancer. Tongue diagnosis is an important part of inspection, the first of the four diagnostic methods in traditional Chinese medicine (TCM). It reflects the condition of the body rapidly, clearly and objectively, and is an important basis for diagnosis in TCM. In recent years, with the development of artificial intelligence, objectified tongue diagnosis has often been applied in clinical diagnosis and treatment because it is simple and accurate, but there are few studies on the objectification of tongue diagnosis for colorectal polyps.

Objective: 1. To explore the effectiveness of the U-Net deep learning network and K-means clustering for tongue image segmentation. 2. To analyze the tongue image characteristics associated with colorectal polyps and with different TCM syndromes in comparison with a healthy control group, to explore their correlation with the disease and its syndromes, and to provide a theoretical basis for better clinical diagnosis and treatment of colorectal polyps. 3. To construct an exploratory prediction model for colorectal polyps, analyze the risk factors for colorectal polyps, and explore the role of tongue image features in a multi-indicator fusion model.

Method: 1. Consecutive subjects recruited at the Endoscopy Center of Beijing Hospital of Traditional Chinese Medicine, Capital Medical University, from March 15, 2022 to March 14, 2023 were studied, and a total of 883 cases were
included, comprising 685 patients in the colorectal polyp group and 198 subjects in the healthy control group. Tongue images of both groups were captured with a ZMT-1A tongue diagnosis instrument, and images with a resolution of 576 x 768 were retained after data preprocessing. The dataset was first randomly split at a ratio of 8:2, and the tongue region was then manually segmented and labeled with Labelme software to obtain valid labels. Finally, a tongue segmentation model was built on the U-Net semantic segmentation network, and its results were evaluated using mean pixel accuracy (MPA) and mean intersection over union (MIoU). The segmented tongue images were saved in RGB, and similar features in the tongue images were extracted by clustering: the K-means clustering algorithm divided the tongue pixels into six pixel clusters according to their RGB values. Pixel clusters of the same category were then merged by manual classification to obtain the pixel regions of the tongue coating and the tongue body. Five TCM physicians of chief physician rank or above then assessed the resulting separation of tongue coating and tongue body. Using OpenCV, the tongue images with qualified coating and body separation were divided into five regions (tongue tip, tongue middle, tongue side left, tongue side right and tongue root) according to the TCM theory that maps regions of the tongue to the internal organs. The tongue image indexes R, G, B, L, a, b, H, S and V in the RGB, Lab and HSV color spaces were then acquired for the overall and partitioned tongue coating and tongue body. 2. Based on the TCM syndrome differentiation criteria for colorectal polyps, 830 subjects were classified into spleen deficiency and dampness syndrome, dampness and heat syndrome, wind injury and intestinal complex syndrome, qi stagnation and blood stasis syndrome and spleen and kidney
yang deficiency syndrome. Differences in tongue image indexes between the colorectal polyp group and the healthy control group were analyzed statistically, as were differences between the TCM syndromes within the colorectal polyp group. 3. Risk prediction models for colorectal polyps were explored from machine learning and statistical learning perspectives. From the machine learning perspective, random forest (RF), decision tree (DT), gradient boosting machine (GBM), support vector machine (SVM) and XGBoost algorithms were used for feature selection and risk prediction model construction, and the best features and models were selected by considering variable importance and model performance. From the statistical learning perspective, Lasso-Logistic, Ridge-Logistic, Elastic net-Logistic and Stepwise Logistic methods were used to screen the predictor variables and construct the predictive models; these methods consider the correlation among predictor variables and the model fit when selecting the best predictors and models. Comparing the machine learning and statistical learning methods allowed more comprehensive and accurate risk prediction models to be obtained.

Result: 1. The U-Net tongue segmentation model achieved a mean pixel accuracy of 98.99% and an MIoU of 97.25%. The separation of tongue body and tongue coating produced by K-means clustering for the 883 images was assessed by TCM physicians of chief physician rank or above, and the qualification rate was 94%; overall and partitioned tongue body and tongue coating characteristic indexes were obtained for 830 images. 2. In the partitioned comparison of tongue image characteristic indexes between the colorectal polyp group and the healthy control group, L, a and b of tongue color at tongue side right, tongue middle, tongue side left and tongue root were lower in the colorectal polyp group than in the healthy control group (P<0.05); L, a and b of coating color at tongue tip, tongue side right, tongue middle, tongue
side left, and L and b of coating color at tongue root, were higher in the colorectal polyp group than in the healthy control group (P<0.05). In the comparison of tongue image characteristic indexes between the different TCM syndromes in the colorectal polyp group and the healthy control group, tongue body a and tongue coating a were lower (P<0.05) and tongue coating R, G, L, H and V were higher (P<0.05) in the dampness and heat group than in the healthy control group. In the qi stagnation and blood stasis group, tongue body a was lower (P<0.05) and tongue body R, G, B, L, H and V were higher (P<0.05) than in the healthy control group; in the spleen deficiency and dampness group, tongue body a was lower (P<0.05) and tongue body G, L and H were higher (P<0.05); in the spleen and kidney yang deficiency group, tongue body a was lower (P<0.05) and tongue body H was higher (P<0.05). Tongue body H was higher (P<0.05). 3. The 27 variables screened by the Elastic net-Logistic and Lasso-Logistic models were used as basic features in the penalized logistic regression model to build the colorectal polyp risk prediction model. The model covered several kinds of information: baseline information (age, gender, BMI), personal history (history of alcohol consumption, history of smoking, regular consumption of preserved foods, regular consumption of smoked and grilled foods, sedentary lifestyle), disease history (high blood pressure, diabetes, history of Helicobacter pylori infection, family history of polyps), test parameters (fecal occult blood, leukocyte count, platelet count and lymphocyte count) and tongue image characteristic parameters (tongue body L, a, b, H, S, V and tongue coating L, a, b, S, V). Stepwise Logistic screening identified tongue body b, tongue body S, tongue coating L, tongue coating a, fecal occult blood, white blood cell count and age as risk factors for colorectal polyps. The classification ability of the model constructed based on common clinical
indicators of colorectal polyps was AUC=0.733; when the tongue image indexes were added, the classification ability of the colorectal polyp risk prediction model improved, with the AUC increasing to 0.756.

Conclusion: 1. The U-Net deep learning network and the K-means clustering algorithm can be used for objective extraction of tongue image features. 2. The correlation between tongue image characteristics and colorectal polyps, and the differences between the TCM syndrome types of colorectal polyp patients, can provide a quantitative basis for the clinical diagnosis and treatment of colorectal polyps. 3. The risk factors for colorectal polyps include tongue image characteristics (TB-b, TB-S, TC-L, TC-a), age, white blood cell count and fecal occult blood. It is feasible to construct a colorectal polyp prediction model based on the fusion of tongue image features with multiple indicators, and such a model provides a basis for future colorectal polyp prediction models that fuse multiple features from Chinese and Western medicine.
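
The segmentation metrics named in the Method can be illustrated with a minimal sketch. It assumes the common definitions of MPA (per-class pixel accuracy averaged over classes) and MIoU (per-class intersection over union averaged over classes); the abstract does not state the exact formulas used, and the function names and the tiny flattened masks below are hypothetical.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def mpa_and_miou(truth: Sequence[int], pred: Sequence[int],
                 num_classes: int = 2) -> Tuple[float, float]:
    """Return (MPA, MIoU) for flattened label masks.

    Assumed definitions: MPA averages TP / (pixels truly in the class);
    MIoU averages TP / (union of true and predicted pixels of the class).
    Classes absent from both masks are skipped.
    """
    cm = confusion_matrix(truth, pred, num_classes)
    accuracies, ious = [], []
    for c in range(num_classes):
        tp = cm[c][c]
        actual = sum(cm[c])                                   # true pixels of class c
        predicted = sum(cm[r][c] for r in range(num_classes))  # predicted as class c
        union = actual + predicted - tp
        if actual:
            accuracies.append(tp / actual)
        if union:
            ious.append(tp / union)
    return sum(accuracies) / len(accuracies), sum(ious) / len(ious)


# Hypothetical flattened masks: 0 = background, 1 = tongue.
truth_mask = [0, 0, 0, 0, 1, 1, 1, 1]
pred_mask = [0, 0, 0, 1, 1, 1, 1, 1]
mpa, miou = mpa_and_miou(truth_mask, pred_mask)
```

In this toy case one background pixel is predicted as tongue, so the background class scores 0.75 on both measures while the tongue class scores 1.0 accuracy and 0.8 IoU.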
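
The clustering step in the Method (dividing tongue pixels into six clusters by RGB value) can be sketched as follows. This is an illustration only, not the study's implementation: it uses plain Lloyd iterations with a deterministic farthest-point initialization, which is an assumption because the abstract does not describe initialization or tooling, and the pixel values are synthetic. The later manual merging of clusters into tongue coating and tongue body is not modeled.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[float, float, float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(pixels: List[Pixel], k: int) -> List[Pixel]:
    """Farthest-point initialization (assumed, chosen for determinism)."""
    centers = [pixels[0]]
    while len(centers) < k:
        farthest = max(pixels,
                       key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(farthest)
    return centers


def kmeans(pixels: List[Pixel], k: int = 6,
           max_iter: int = 50) -> Tuple[List[int], List[Pixel]]:
    """Cluster RGB pixels into k groups; returns (labels, centers)."""
    centers = init_centers(pixels, k)
    labels: Optional[List[int]] = None
    for _ in range(max_iter):
        # Assignment step: each pixel takes the index of its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in pixels]
        if new_labels == labels:
            break  # assignments stable, so centers are stable too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                updated.append(tuple(sum(ch) / len(members)
                                     for ch in zip(*members)))
            else:
                updated.append(centers[j])  # keep an empty cluster's center
        centers = updated
    return labels or [], centers


# Synthetic RGB samples: six well-separated colors, three pixels each.
bases = [(200, 80, 90), (230, 210, 200), (120, 40, 50),
         (250, 250, 250), (160, 120, 110), (90, 90, 140)]
pixels = [px for (r, g, b) in bases
          for px in ((r, g, b), (r + 2, g, b), (r, g + 2, b + 2))]
labels, centers = kmeans(pixels, k=6)
```

On real tongue images the clusters would not be this cleanly separated, which is consistent with the abstract's use of manual merging and physician review after clustering.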