| Objective: Colorectal cancer (CRC) is the most prevalent gastrointestinal malignancy in the world and has become a major global public health problem. Detecting colorectal cancer earlier is of great importance for reducing its mortality rate. However, most current clinical diagnostic methods for colorectal cancer are expensive, cumbersome, radioactive, invasive and poorly accepted by patients. It is therefore important to develop a simple, economical and efficient method for the early diagnosis and screening of colorectal cancer. Herein, machine learning methods were applied to reveal the information implicit in medical test data such as routine blood test parameters, and the factors closely related to colorectal cancer were studied in order to achieve early diagnosis of colorectal cancer by constructing a prediction model. The modeling method was also verified in the early diagnosis of nasopharyngeal carcinoma by constructing a novel prediction model for that disease. Methods: Convenience sampling was used to select study participants. In total, 327 CRC patients and 311 healthy controls from a hospital in Luzhou were recruited between December 2016 and December 2020. First, SPSS 19.0 software was used for data preprocessing, univariate analysis of the variables and analysis of variable influence strength. Machine learning methods (random forest, support vector machine, etc.) were then used to construct a prediction model, and the classification and diagnostic ability of the prediction model was evaluated. Results: (1) Among the traditional risk factors, age, sex, diabetes mellitus (DM), alcohol consumption and hypertension (HTN) were associated with colorectal cancer. LYMPH%, HCT and HGB can be used as auxiliary diagnostic indicators of CRC. (2) Using the random forest (RF) algorithm, 12 variables (LYMPH#, LYMPH%, HCT, RDW-SD, NEUT%, RBC, PDW, HGB, RDW-CV, PCT, MONO% and P-LCR) were selected as the data subset for constructing the early diagnosis model of colorectal cancer. (3) The support vector machine (SVM) performed better in constructing the early diagnosis model of colorectal cancer, with a prediction accuracy, specificity and sensitivity of 0.841 (0.811, 0.871), 0.830 (0.786, 0.875) and 0.852 (0.808, 0.895), respectively. The area under the curve (AUC) was 0.911 (0.887, 0.934). (4) An early prediction model of nasopharyngeal carcinoma was constructed with RF and SVM, with an accuracy of 0.943, a specificity of 0.882, a sensitivity of 0.981 and an AUC of 0.932. Conclusion: CRC is a prevalent malignant tumor in China, and its prevention and control are of great significance to the public and the country. The results of this study show that a classification prediction model based on routine blood test data combined with a support vector machine has high predictive ability. This provides a simple and feasible method for the early diagnosis of CRC, as well as a theoretical basis for the diagnosis and prevention of CRC. In addition, a model for the early prediction of nasopharyngeal carcinoma was established using the same method, and it performed well enough to serve as a reference for the clinical screening of nasopharyngeal carcinoma. |