| Colorectal cancer (CRC) is one of the most commonly diagnosed cancers and a leading cause of cancer-related mortality in China, with a 5-year survival rate of about 50%. In China, about 159,000 people die of colorectal cancer each year, and CRC ranks fifth in cancer mortality. At present, curative surgery is the primary treatment for non-metastatic colorectal cancer. In addition to surgery, 5-Fu-based adjuvant chemotherapy (ACT) following curative surgery is considered the standard treatment for patients with stage II and III CRC who are at high risk of relapse, according to the therapeutic guidelines of the National Comprehensive Cancer Network (NCCN) and the Chinese Society of Clinical Oncology (CSCO). However, even after ACT, the recurrence rate of stage II and III CRC patients is approximately 30%-50%, and researchers have found that fewer than 50% of stage II and III CRC patients receive ACT, primarily because of the severe adverse effects of chemotherapy. Against this background, the 2019 report of the American Society of Clinical Oncology lists "Better Define the Patient Populations That Benefit From Postoperative (Adjuvant) Therapy" as a focus of future research. No method can currently predict the effect of adjuvant chemotherapy for colorectal cancer patients, and no marker is in clinical use. It is difficult to predict the effect with a single molecular marker; therefore, the combined use of multiple markers may be able to resolve this problem. Predictive models for the effect of ACT have been reported in only a limited number of studies, and their predictive performance was poor. Two issues account for these shortcomings. On the one hand, some studies built their models from prognostic factors rather than factors related to the effect of chemotherapy, which makes an accurate and reliable prediction model difficult to construct; such models can achieve good accuracy but cannot be validated in external datasets. On the other hand, some studies selected differentially expressed
genes (DEGs) from stage II-III drug-resistant colorectal cell lines and developed a corresponding drug score system; however, the DEGs may be irrelevant to drug sensitivity or resistance, since they are simply assumed to capture drug-induced transcriptional changes. In this article, we used a support vector machine with a genetic algorithm to select ACT candidate genes and build a four-gene predictive model, termed the SVM-GA model, from a gene expression profile database. Recent studies have accomplished this result, and it is worthwhile for clinical application. We performed a validation test, compared our model with a previous study from the literature, and interpreted the model in terms of molecular mechanism; the model can help clinicians optimize their decision making. Materials and methods: We downloaded the transcriptome profiling expression values of three cohorts (GSE14333, GSE29621, GSE39582) from the GEO database as a training cohort. Meanwhile, we downloaded the mRNA sequence array expression data of CRC patients from the TCGA data portal as a test cohort. Samples were obtained from untreated primary stage II-III colorectal cancer tissues, and relapsed patients with less than 36 months of follow-up were removed to control bias. The patients were divided into ACT-benefit and ACT-futile groups according to treatment and relapse-free survival (RFS) time. After comparing the two groups' chi-square values using the Wilcoxon test, we selected DEGs between the ACT-benefit and ACT-futile groups in the training cohort to build a predictive model. We carried out a KEGG pathway analysis using the R package clusterProfiler and a Reactome pathway analysis for these differentially expressed genes. We used a support vector machine (SVM) with a genetic algorithm (GA) to select ACT candidate genes and built a predictive model using gene expression profiles from the Gene Expression Omnibus database. Using a Subpopulation Treatment Effect Pattern Plot (STEPP) to determine the cut-off value of the predictive scores, the
validation patients from The Cancer Genome Atlas database were divided into predicted ACT-benefit and ACT-futile groups. After patients in the test cohort were stratified into two groups according to the determined cut-off point, we used a log-rank test to compare the RFS rates of patients with and without ACT in each of the two groups. We performed propensity score (PS) analysis as a sensitivity analysis. After comparing the expression orderings of the reported six-gene-pair signature (6-GPS), patients showing at least half of the relative expression orderings (REOs) of the set of gene pairs were stratified into the high-risk group, while the remaining patients were stratified into the low-risk group. Results: 1) Data preprocessing and characteristics: The training cohort included 568 patients from the GEO database and the test cohort included 138 patients from the TCGA database. 2) Selection of 5-Fu-based ACT candidate genes: After performing a Wilcoxon test on gene expression values in the training cohort between patients in the ACT-benefit and ACT-futile groups, we identified 240 significantly differentially expressed genes (DEGs). 3) Functional analysis of ACT-relevant genes: The DEGs with high expression in the ACT-benefit group were mostly enriched in pathways relevant to MAPK and Notch, whereas genes with high expression in the ACT-futile group were mostly enriched in pathways related to nonsense-mediated decay (NMD) and p53 signaling. 4) Building the SVM-GA model for stage II-III colorectal cancer: Using SVM and GA, we constructed and optimized a predictive model with the TNM stage and the 240 ACT candidate genes as input variables and ACT-benefit/-futile group membership as the outcome. The model contained four ACT candidate genes (EDEM1, MVD, SEMA5B, and WWP2) combined with TNM stage (training dataset AUC = 0.703). 5) Validation and evaluation of the SVM-GA model: Using a Subpopulation Treatment Effect Pattern Plot (STEPP) to determine the cut-off value of predictive
scores, the validation patients from The Cancer Genome Atlas database were divided into predicted ACT-benefit and ACT-futile groups (best cut-off = 0.8). Patients in the predicted ACT-benefit group who received 5-Fu-based ACT had significantly longer relapse-free survival than those without ACT (P = 0.015, HR = 0.345, 95% CI = 0.140-0.850). However, the difference in the predicted ACT-futile group was not significant (P = 0.596, HR = 1.211, 95% CI = 0.598-2.454). The interaction between ACT and the predicted ACT group (ACT-benefit versus ACT-futile) with respect to RFS was significant (univariable analysis, P for interaction = 0.028; multivariable analysis, P for interaction = 0.011). However, there was no significant interaction between ACT and the other characteristics. Meanwhile, we applied a decision tree algorithm that included the TNM stage and the four identified ACT candidate genes. 6) Analysis of clinical characteristics: We stratified patients in the predicted groups by TNM stage and found that neither stage II nor stage III patients in the predicted ACT-futile group showed a significant difference between patients who received ACT and those who received surgery only (P = 0.707 and P = 0.896 for stage II and III patients, respectively). 7) Decision tree: The predictive decision tree model included 8 rules based on the normalized profiles of the selected variables. For stage III colorectal cancer patients with WWP2 > -0.003 and MVD > 0.167, the likelihood of receiving effective ACT is 0.875. 8) Sensitivity analysis: 73 patients in the test set were matched and retained for validation, and clinicopathologic differences between patients who received adjuvant chemotherapy and those who did not were insignificant. In the sensitivity analysis, patients in the predicted ACT-benefit group who received ACT still had significantly longer RFS than those who did not (P = 0.031, HR = 0.300, 95% CI = 0.094-0.958). 9) Evaluation of the effectiveness of the 6-GPS REO-based signature: We compared the relative expression orderings of the 6-GPS and stratified the
patients into 5-Fu-based high- and low-risk groups. In both the predicted high- and low-risk groups, there were no significant RFS differences between patients who received ACT and those with surgery only (P = 0.676 for the high-risk group and P = 0.222 for the low-risk group). Similarly, there were no significant RFS differences between the high- and low-risk groups among patients who received ACT or among those with surgery only (P = 0.113 for patients who received ACT and P = 0.818 for patients with surgery only). Therefore, the 6-GPS REO-based signature was not considered suitable for the test cohort. Conclusion: In summary, we developed an SVM-GA model to predict the effect of 5-Fu-based ACT on recurrence in CRC patients. This model can help clinicians optimize decision making for CRC patients who are suitable for 5-Fu-based ACT and spare patients predicted to be ACT-futile the adverse effects of chemotherapy. |
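
The REO-based stratification rule used for the 6-GPS comparison (a patient is high-risk when at least half of the gene pairs show the signature ordering) can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the published implementation: the six gene pairs are not listed in this abstract, so the names below are hypothetical placeholders, and treating "first gene expressed above second gene" as the risk-associated ordering is an assumed convention.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholders: the abstract does not name the six gene pairs.
GENE_PAIRS: List[Tuple[str, str]] = [
    ("GENE_A1", "GENE_B1"),
    ("GENE_A2", "GENE_B2"),
    ("GENE_A3", "GENE_B3"),
    ("GENE_A4", "GENE_B4"),
    ("GENE_A5", "GENE_B5"),
    ("GENE_A6", "GENE_B6"),
]


def count_reo_votes(expression: Dict[str, float],
                    pairs: List[Tuple[str, str]]) -> int:
    """Count gene pairs whose first gene is expressed above the second.

    Assumption: "first > second" is the risk-associated ordering.
    """
    return sum(1 for first, second in pairs
               if expression[first] > expression[second])


def classify_by_reo(expression: Dict[str, float],
                    pairs: List[Tuple[str, str]] = GENE_PAIRS) -> str:
    """Assign high-risk when at least half of the pairs vote for it."""
    votes = count_reo_votes(expression, pairs)
    # "At least half" without floating point: 2 * votes >= number of pairs.
    return "high-risk" if 2 * votes >= len(pairs) else "low-risk"


def make_profile(n_risk_pairs: int) -> Dict[str, float]:
    """Build a toy profile in which the first n_risk_pairs pairs vote high-risk."""
    profile: Dict[str, float] = {}
    for index, (first, second) in enumerate(GENE_PAIRS):
        profile[first] = 2.0 if index < n_risk_pairs else 1.0
        profile[second] = 1.0 if index < n_risk_pairs else 2.0
    return profile


if __name__ == "__main__":
    for n in (2, 3, 6):
        print(n, classify_by_reo(make_profile(n)))
```

Because the rule only compares expression values within a single sample, it does not depend on cross-sample normalization, which is the usual motivation for REO-based signatures.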