| Background: Colorectal cancer (CRC) is one of the most common malignant tumors. Globally, it has the third highest incidence and the second highest mortality of all malignancies. In China, colorectal cancer is the second most common malignant tumor, and its mortality has risen into the top five. The occurrence and development of colorectal cancer are the cumulative result of a series of molecular events under the combined effect of environmental and genetic factors. Colorectal cancer is divided into molecular subtypes according to its molecular characteristics; these currently include the microsatellite instability (MSI) subtype, the chromosomal instability (CIN) subtype and the CpG island methylator phenotype (CIMP). Colorectal cancer shows marked biological heterogeneity, with both individual differences and complex temporal and spatial heterogeneity. As a result, patients with pathologically identical tumors can have distinctly different prognoses, and outcomes of existing treatments are poor. Therefore, a tumor typing system based on differences in molecular characteristics would support personalized treatment and accurate prediction of prognosis. An R-loop is a three-stranded DNA/RNA hybrid structure formed during transcription, and disordered R-loop metabolism is closely related to genomic instability. Accurate and dynamic regulation of R-loops by R-loop binding proteins (RLBPs) plays an important role in genome stability and cell homeostasis, whereas imbalance in R-loop biogenesis and regulation can lead to nervous system diseases, autoimmune diseases and tumors. Since genomic instability is an important feature of colorectal cancer, this study explores the use of R-loop binding proteins for molecular typing and evaluates their significance for predicting prognosis and treatment response in colorectal cancer, potentially providing novel options for precise diagnosis and treatment in clinical practice. Methods: 
According to previous literature reports, 204 RLBPs were screened and analyzed. We performed non-negative matrix factorization (NMF)-based unsupervised clustering using the 204 RLBPs in the 2014 CPTAC CRC database, which classified the tumor samples into two subtypes with distinct clinical outcomes. To further facilitate stratification, we built a subtype classification model using the random forest algorithm to classify samples in the 2019 CPTAC or 2020 Cancer Cell CRC databases, and then matched the subtypes assigned by the random forest model to the NMF subtypes. Correlation analyses were then conducted with the clinical features, pathological types, molecular characteristics, immune features and drug therapy of colorectal cancer. Next, the effects of EMG1 overexpression on DNA damage repair and genome instability were assessed by neutral comet assay, western blotting (WB) and immunofluorescence staining of γH2AX foci. A 405 nm laser micro-irradiation system was used to induce DNA damage, and immunofluorescence (IF) was employed to test whether EMG1 is directly involved in DNA damage repair and how it regulates genome stability. The domain and upstream regulatory molecules through which EMG1 participates in DNA damage repair were also examined by 405 nm laser micro-irradiation. Finally, we screened a series of chemotherapeutic drugs in EMG1-overexpressing colorectal cancer cells for chemosensitization. Results: 1. Based on the expression levels and types of RLBPs expressed in CRC, the tumors were classified into two major types: CI, in which RLBPs are highly expressed, had a better prognosis, while CII (dRLBPs, RLBPs-deficient) had a worse prognosis. 2. The CI subtype more often manifested chromosomal instability, high expression of DNA damage repair (DDR)-related genes and active RNA metabolic signaling pathways, and was sensitive to EGFR inhibitors and to drug therapy targeting genome stability. The CII subtype showed abundant infiltration of lymphocytes and macrophages and active inflammatory signaling pathways. 3. DDX21 was identified as a marker for 
predicting drug sensitivity: tumors with high DDX21 expression were more sensitive to anti-EGFR drugs and genome-stabilizing drugs. 4. Forty-two R-loop binding proteins were highly expressed in colorectal cancer; screening them by neutral comet assay, we found that high expression of EMG1 increased DNA double-strand breaks. Tumors with high EMG1 expression were more sensitive to DNA-damaging drugs. Further studies showed that EMG1 is directly involved in DNA damage repair through its C-terminus and is regulated by both PARP1 and ATR. Overexpression of EMG1 increased DNA breakage, delayed the completion of DNA damage repair and promoted genomic instability. Conclusion: In this study, RLBPs were found to be markers for molecular typing of colorectal cancer that can guide prognosis and treatment. The two clusters correlated with CIN phenotypes, anatomical location and pathological status. Furthermore, the two clusters were distinguished by physiological features and tumor microenvironment (TME). In addition, we identified 42 tumor-related RLBPs; abnormal expression of R-loop binding proteins may disrupt DNA damage repair and thereby lead to genome instability. We showed that clusters with high expression of tumor-related RLBPs displayed different drug sensitivity involving EGFR and genome stability pathways. Thus, the associations established here between RLBPs and cancer provide a solid basis for further functional investigation of R-loop binding proteins in cancer. The RLBP-based clustering also bridges the gap between CIN and RLBPs, facilitating personalized diagnosis and appropriate selection of therapy in clinical application. |