| Objective: To analyze the prevalence of diabetic retinopathy (DR) in patients with type 2 diabetes mellitus, to identify the factors influencing DR using machine learning algorithms and logistic regression, and to build and evaluate a clinical prediction model for DR. Methods: A total of 5148 outpatients and inpatients with type 2 diabetes mellitus treated at Shijiazhuang Second Hospital from March 2016 to December 2020 were selected as study subjects, and their basic information, physical examination and laboratory test data were collected. Patients were randomly divided into a training set and a test set at a ratio of 7:3, and 752 patients with type 2 diabetes mellitus treated at the same hospital from January 2021 to June 2021 formed the validation set. Descriptive methods were used to analyze the basic characteristics of the subjects and the prevalence of DR. Univariate logistic regression was used to explore the strength of association between each candidate predictor and DR, and the predictors were screened by least absolute shrinkage and selection operator (LASSO) regression. Three machine learning algorithms (BP neural network, decision tree and random forest) were each used to construct a clinical prediction model for DR. Receiver operating characteristic (ROC) curves were drawn to compare the predictive performance of the models. Multivariate logistic regression was used to quantify the predictors of the machine learning model with the best predictive performance, and the result was visualized as a nomogram. The discrimination of the DR nomogram model was evaluated with the ROC curve, its calibration was assessed with the calibration curve and the Hosmer-Lemeshow test, and its clinical usefulness was evaluated with decision curve analysis (DCA) and the clinical impact curve (CIC). Results: 1. A total of 5148 subjects were included in this study, including 2605 males and 2543 females. The detection rate of DR in the total population was 15.35%. The detection rate of DR in people aged 70 and
above was the highest (18.32%), and the detection rate increased with age (P<0.05). The detection rate of DR in smokers (17.38%) was higher than that in non-smokers (P<0.05). The detection rate of DR was highest in people with a diabetes duration of 15 years or more (32.77%), and it increased with the duration of diabetes (P<0.05). The detection rates of DR in patients with diabetic peripheral neuropathy, diabetic nephropathy, diabetic foot, hypertension, dyslipidemia, coronary heart disease and cerebral infarction were 32.82%, 38.07%, 37.00%, 17.55%, 19.41%, 20.84% and 18.49%, respectively, all higher than in patients without the corresponding disease (P<0.01). The detection rate of DR in patients using hypoglycemic drugs, antihypertensive drugs, lipid-regulating drugs, nitrates or aspirin was higher than in those not using these drugs (P<0.01). 2. The BP neural network model finally included the following predictors: α-hydroxybutyrate dehydrogenase (α-HBDH), lactate dehydrogenase (LDH), glycosylated albumin (GA), apolipoprotein B (apo B), indirect bilirubin (IBIL), blood urea nitrogen (BUN), waist circumference and estimated glomerular filtration rate (eGFR). In the training set, the area under the ROC curve (AUC) of the model was 0.791 (95% CI: 0.777, 0.804) and the diagnostic cut-off value was 0.192. In the test set, the AUC was 0.754 (95% CI: 0.732, 0.776), the sensitivity was 66.12% and the specificity was 71.98%. In the validation set, the AUC was 0.763 (95% CI: 0.761, 0.803), the sensitivity was 65.79% and the specificity was 79.31%. 3. The decision tree model finally included the following predictors: diabetic peripheral neuropathy, GA, high-sensitivity C-reactive protein (hs-CRP), diabetic nephropathy, LDH and the duration of diabetes. In the training set, the AUC of the model was 0.824 (95% CI: 0.812, 0.837) and the diagnostic cut-off value was 0.120. In the test set, the AUC was 0.797, the sensitivity was 74.69% and the specificity was 69.44%. In the validation set, the AUC was 0.792 (95% CI: 0.771, 0.812), the
sensitivity was 71.05%, and the specificity was 73.27%. 4. The random forest model finally included the following predictors: diabetic peripheral neuropathy, GA, LDH, diabetic nephropathy, duration of diabetes, diabetic foot, IBIL and hypoglycemic drug use. In the training set, the AUC of the model was 0.825 (95% CI: 0.812, 0.837) and the diagnostic cut-off value was 0.179. In the test set, the AUC was 0.815 (95% CI: 0.795, 0.834), the sensitivity was 84.19% and the specificity was 65.13%. In the validation set, the AUC was 0.796 (95% CI: 0.765, 0.824), the sensitivity was 71.05% and the specificity was 77.74%. Among the three DR prediction models (BP neural network, decision tree and random forest), the random forest model had the best predictive performance. 5. The predictors screened by the random forest algorithm were entered into a multivariate logistic regression analysis, which showed that seven factors (diabetic peripheral neuropathy, diabetic nephropathy, diabetic foot, GA, LDH, duration of diabetes and hypoglycemic drug use) were independent risk factors for DR, and that IBIL was a protective factor for DR. In the training set, the AUC of this model was 0.828 (95% CI: 0.815, 0.840) and the diagnostic threshold was 0.177. In the test set, the AUC was 0.823 (95% CI: 0.803, 0.842), the sensitivity was 72.24% and the specificity was 77.68%. In the validation set, the AUC was 0.828 (95% CI: 0.799, 0.854), the sensitivity was 74.56% and the specificity was 80.72%. The calibration curve and the Hosmer-Lemeshow test showed that the model was well calibrated (P>0.05). DCA and CIC showed that the nomogram model produced a greater net benefit and had higher clinical value when the risk threshold was 5% to 78% in the training set, 5% to 63% in the test set and 11% to 66% in the validation set. Conclusions: 1. The detection rate of DR in patients with type 2 diabetes mellitus is 15.35%. The detection rate of DR is higher in patients aged 70 years or older, smokers, patients with a diabetes duration of 15
years or more, and patients with diabetic peripheral neuropathy, diabetic nephropathy, diabetic foot, hypertension, dyslipidemia, coronary heart disease or cerebral infarction, as well as in patients using hypoglycemic drugs, antihypertensive drugs, lipid-regulating drugs, nitrates or aspirin. 2. Diabetic peripheral neuropathy, diabetic nephropathy, diabetic foot, GA, LDH, duration of diabetes and hypoglycemic drug use are risk factors for DR, and IBIL is a protective factor for DR. 3. The nomogram model based on diabetic peripheral neuropathy, GA, LDH, diabetic nephropathy, duration of diabetes, diabetic foot, IBIL and hypoglycemic drug use has good predictive performance and clinical value for predicting DR in patients with type 2 diabetes mellitus. |
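
The abstract reports a diagnostic cut-off value, sensitivity and specificity for each model, and a net benefit from DCA, without stating how they were computed. The sketch below illustrates one common way to obtain such quantities. It assumes that the cut-off is chosen by maximizing the Youden index (sensitivity + specificity - 1) and that net benefit follows the standard decision-curve formula TP/N - FP/N × pt/(1 - pt). Both are assumptions rather than the authors' confirmed method. The function names and the toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_prob, cutoff):
    """Count TP, FP, TN, FN when predicting positive for probability >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_prob, cutoff):
    """Return (sensitivity, specificity) at the given probability cut-off."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(y_true, y_prob):
    """Return the observed probability that maximizes sensitivity + specificity - 1."""
    def youden(c):
        sens, spec = sensitivity_specificity(y_true, y_prob, c)
        return sens + spec - 1
    return max(sorted(set(y_prob)), key=youden)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at risk threshold pt: TP/N - FP/N * pt / (1 - pt)."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data (NOT from the study): 1 = DR, 0 = no DR.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_prob = [0.80, 0.60, 0.15, 0.10, 0.30, 0.05, 0.20, 0.12, 0.40, 0.50]

cutoff = youden_cutoff(y_true, y_prob)
sens, spec = sensitivity_specificity(y_true, y_prob, cutoff)

# Compare the model with a "treat everyone" strategy at a 35% risk threshold.
nb_model = net_benefit(y_true, y_prob, 0.35)
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), 0.35)

print(f"cut-off={cutoff:.2f} sensitivity={sens:.2%} specificity={spec:.2%}")
print(f"net benefit: model={nb_model:.3f} treat-all={nb_treat_all:.3f}")
```

In a decision curve, `net_benefit` would be evaluated over a range of thresholds. The model is considered clinically useful over the thresholds where its curve lies above both the treat-all and the treat-none (net benefit 0) strategies, which is how threshold ranges such as those reported for the nomogram are typically read.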