
Prognosis Classification Prediction Model Of Primary Colorectal Malignancy With Radical Operation

Posted on: 2018-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: C F Du
Full Text: PDF
GTID: 2334330536472247
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Colorectal cancer (CRC), a common malignant tumor of the digestive tract, has become an important threat to human health because of its rising morbidity and mortality. The main treatment for CRC is surgery, but the risk of postoperative recurrence or metastasis is hard to avoid. Once relapse or metastasis occurs after the operation, it seriously affects the patient's prognosis, so predicting the prognosis of postoperative patients is particularly important. Existing studies have addressed this question, but they mainly used traditional multivariable logistic regression and Cox regression. Both methods require large sample sizes and are unsuitable for the high-dimensional data that are common in medical science.

In recent years, machine learning algorithms such as the support vector machine (SVM), which is based on statistical learning theory, and the random forest (RF) algorithm have come into wide use. They are suited to classification problems with small samples and high-dimensional data. Combining them with a feature selection algorithm can yield a model with better generalization ability while reducing both the redundancy of the feature space and the training cost. The Boruta algorithm also sidesteps the relationships among variables by measuring the importance of each feature to the outcome. It is applicable to medical data but has seldom been used there.

This study has two parts. The first part used UCI standard data sets for simulated prediction: traditional difference analysis was performed in SPSS 22.0; Boruta feature selection and the SVM and RF models were built in R 3.30; and the predictive performance of the models was compared in Stata 14.0. The second part validated the models selected in part one on the CRC data set.

The results show:

(1) On the UCI standard data sets, the RF model performed well on the whole data set (AUC = 0.717); the better model after dimension reduction by difference analysis was the Polynomial-SVM model (AUC = 0.756); and the better model after dimension reduction by Boruta feature selection was the RF model (AUC = 0.905). Comparison of the ROC curves showed significant differences among the three dimension reduction methods (χ² = 7.27, P = 0.026).

(2) Univariate difference analysis of the CRC data showed that tumor site, CA19-9, CEA, infiltration depth, nerve invasion, vascular invasion, T stage, N stage, Dukes staging, intra-operative radiotherapy, postoperative chemotherapy and the number of positive lymph nodes differed significantly between the prognosis groups (P < 0.05). According to the Boruta screening method, CA19-9, the number of positive lymph nodes, nerve infiltration, operation time, chemotherapy and the number of chemotherapy cycles were the important factors for the outcome.

(3) Comparison of the postoperative CRC risk prediction models showed that the Polynomial-SVM model worked best on the whole data set (AUC = 0.907); the Polynomial-SVM model also performed well after dimension reduction by difference analysis (AUC = 0.911); and among the models built on Boruta-selected features, the RF model was better (AUC = 0.982). The three methods differed significantly (χ² = 7.74, P = 0.021).

(4) The Cox proportional hazards model showed that high CA19-9 (RR = 2.002, 95% CI: 1.143 ~ 3.505), the number of positive lymph nodes (RR = 1.244, 95% CI: 1.141 ~ 1.357), nerve infiltration (RR = 2.206, 95% CI: 1.130 ~ 4.308) and intra-operative radiotherapy (RR = 2.098, 95% CI: 1.191 ~ 3.696) may be risk factors for the postoperative outcome of colorectal cancer.

Models built on Boruta variable screening predicted better than models built on traditional difference analysis. The Boruta feature selection algorithm can therefore serve as a means of dimension reduction in practical studies: it reduces the learning time and space complexity of the model and also improves predictive performance. The RF model based on the Boruta algorithm can predict postoperative prognosis and help doctors conduct preoperative intervention.
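
The test statistics quoted in the results can be checked with a few lines of arithmetic. The sketch below is not the thesis's own code (the abstract reports using SPSS, R and Stata); it is a minimal Python illustration under two assumptions that the abstract does not state: that each ROC comparison of three models was a chi-square test with 2 degrees of freedom, and that the 95% confidence intervals are Wald intervals built on the log scale. The function and variable names are made up for this illustration.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom.

    Assumption: df = 2 (three models compared). For df = 2 the survival
    function has the closed form exp(-x / 2).
    """
    return math.exp(-x / 2.0)


def wald_summary(estimate, lower, upper, z_crit=1.959964):
    """Recover the log-scale standard error and z statistic from a ratio and its 95% CI.

    Assumption: the interval is exp(log(estimate) +/- z_crit * SE).
    Under that assumption the estimate is the geometric mean of the bounds.
    """
    log_se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / log_se
    geometric_mid = math.sqrt(lower * upper)
    return log_se, z, geometric_mid


# Chi-square statistics quoted in results (1) and (3).
p_uci = chi2_sf_df2(7.27)
p_crc = chi2_sf_df2(7.74)
print(f"UCI comparison: P = {p_uci:.3f}")
print(f"CRC comparison: P = {p_crc:.3f}")

# Ratios and 95% CIs quoted in result (4): (estimate, lower, upper).
cox_results = {
    "high CA19-9": (2.002, 1.143, 3.505),
    "number of positive lymph nodes": (1.244, 1.141, 1.357),
    "nerve infiltration": (2.206, 1.130, 4.308),
    "intra-operative radiotherapy": (2.098, 1.191, 3.696),
}

summaries = {name: wald_summary(*values) for name, values in cox_results.items()}
for name, (log_se, z, geometric_mid) in summaries.items():
    print(f"{name}: SE(log) = {log_se:.3f}, z = {z:.2f}, CI midpoint = {geometric_mid:.3f}")
```

Under these assumptions the recomputed P values round to the reported 0.026 and 0.021, and each reported ratio sits at the geometric midpoint of its interval, which is what a log-scale Wald interval would produce.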
Keywords/Search Tags:Primary colorectal malignancy, Risk prediction, Feature selection, Machine learning