
CM-Predict: A Classification Framework for Cancer Metastasis Based on Co-expression Networks

Posted on: 2022-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: J B He
GTID: 2480306758980199
Subject: Computer Science and Technology
Abstract/Summary:
Metastasis is a long-standing problem in cancer research, and elucidating its mechanisms is of great importance for clinical diagnosis and for the treatment of patients with advanced cancer. Metastasis is one of the most lethal processes in cancer development: approximately 90% of cancer deaths occur at the metastatic stage. Metastatic tumors also often show a preference for certain organs, with the brain, bone, liver, and lung being the four most common metastatic sites. Differentially expressed genes are neither conserved nor specific, which makes it difficult to explain complex cancer mechanisms at the systems level and leaves analyses based on them without a holistic view of system-level properties. Thanks to advances in gene sequencing and network modeling techniques, network biology-based analysis methods have been widely used in cancer studies. The study and application of gene co-expression networks have deepened our understanding of cancer at the system level and have provided new insights into the mechanisms of cancer metastasis to specific organs.

To investigate the mechanism of cancer metastasis, we propose CM-Predict, a method for predicting the metastatic organs of cancer. CM-Predict takes advantage of the consistency of gene co-expression in homogeneous cancer samples: it builds a feature-extraction framework from reference networks and perturbation networks and uses the extracted features to classify metastatic cancer samples. The principle of the method is that, for a specific gene co-expression network, adding a sample of the "same class" should not significantly change the co-expression levels of the network; by contrast, adding a sample of another class causes a large perturbation to the original network and thus changes its structure.

We compared CM-Predict with four other machine learning methods. On the cancer metastasis classification task, CM-Predict significantly outperformed the other four methods on the BLCA, ESCA, and LIHC datasets, reaching prediction accuracies of 0.986, 0.964, and 0.933, respectively, together with high recall and F1-score. The method achieved the best results on all three datasets, which makes CM-Predict an effective method for predicting tumor metastasis.

Second, we propose a statistical evaluation method that performs enrichment analysis on the signature gene pairs screened by CM-Predict in order to obtain statistically and biologically significant pathways. The target set of the enrichment analysis is the set of signature gene pairs obtained by CM-Predict. Building on enrichGO enrichment analysis, the method combines the statistics of the signature gene pairs with the pathways enriched by enrichGO and identifies statistically significant biological pathways using the hypergeometric distribution together with Fisher's exact test. The method can effectively perform pathway enrichment analysis of the signature gene pairs for each metastatic organ in the three cancers, and the results show that it captures the key biological processes by which primary cancers metastasize to different organs, providing a new means of elucidating the metastatic mechanism of cancer.
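The reference-network and perturbation-network principle described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the correlation measure, the perturbation statistic, or the classifier, so everything below is an assumption made for illustration: Pearson correlation as the co-expression measure, the mean absolute change in edge weight as the perturbation score, and a simple "smallest perturbation wins" rule in place of the feature-based classifier used in the thesis. The function names are hypothetical and are not part of CM-Predict.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def coexpression_network(samples):
    """Build a network as {(gene_i, gene_j): correlation} from a list of
    per-sample expression vectors (genes in the same order in every sample)."""
    n_genes = len(samples[0])
    columns = [[s[g] for s in samples] for g in range(n_genes)]
    return {
        (i, j): pearson(columns[i], columns[j])
        for i, j in combinations(range(n_genes), 2)
    }


def perturbation_score(reference_samples, new_sample):
    """Mean absolute change in edge weight when new_sample is added
    to the reference samples (assumed statistic, for illustration only)."""
    reference = coexpression_network(reference_samples)
    perturbed = coexpression_network(reference_samples + [new_sample])
    return sum(abs(perturbed[e] - reference[e]) for e in reference) / len(reference)


def classify(references, new_sample):
    """Assign new_sample to the class whose reference network it perturbs least.
    This argmin rule is a simplification, not the classifier from the thesis."""
    scores = {
        label: perturbation_score(samples, new_sample)
        for label, samples in references.items()
    }
    return min(scores, key=scores.get), scores


# Toy data: genes 0 and 1 are positively co-expressed in "bone"
# and negatively co-expressed in "liver".
references = {
    "bone": [[1, 1, 5], [2, 2, 3], [3, 3, 4], [4, 4, 6], [5, 5, 2]],
    "liver": [[1, 5, 5], [2, 4, 3], [3, 3, 4], [4, 2, 6], [5, 1, 2]],
}
label, scores = classify(references, [8, 8, 4])
print(label, scores)
```

In this toy example the new sample follows the positive co-expression pattern of the "bone" group, so it barely changes that reference network while strongly perturbing the "liver" one. In a feature-based variant, the per-edge changes themselves (rather than their mean) would serve as the feature vector passed to a downstream classifier.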
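The hypergeometric test mentioned for the enrichment analysis can also be sketched briefly. The abstract does not say how the statistics of the signature gene pairs are combined with the enrichGO output, so the example below only shows the standard over-representation calculation on a hypothetical gene set; the function name and all numbers are illustrative assumptions. The upper tail of the hypergeometric distribution computed here is the same quantity as the one-sided Fisher's exact test p-value on the corresponding 2x2 table.

```python
from math import comb


def hypergeom_pvalue(population, annotated, drawn, overlap):
    """P(X >= overlap) when `drawn` genes are sampled without replacement from
    `population` background genes, of which `annotated` belong to the pathway."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 background genes, 5 annotated to a pathway,
# 5 genes taken from the signature gene pairs, 3 of which are in the pathway.
p = hypergeom_pvalue(population=20, annotated=5, drawn=5, overlap=3)
print(f"{p:.4f}")
```

A pathway would be reported as enriched when this p-value, usually after multiple-testing correction across all pathways tested, falls below a chosen threshold.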
Keywords/Search Tags:Cancer Metastasis, Machine Learning, Co-expression Network, Functional Enrichment Analysis, Data Mining