Research On Application Of Multi-omics Data Fusion Algorithm Based On Heterogeneous Graph Neural Network In Tumor Classification

Posted on: 2022-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: W B Qiao
Full Text: PDF
GTID: 2504306761459424
Subject: Automation Technology
Abstract/Summary:
In recent years, as high-throughput sequencing technologies have developed rapidly and their cost has fallen, more and more public databases containing high-quality, diverse omics data have become available. As a result, bioinformatics research on omics data has moved from using a single omics data type to using multiple omics data types simultaneously. At the same time, cancer grades and subtypes, as complex traits, differ in their clinical, pathological, and molecular features and carry prognostic and therapeutic implications. Research on cancer grading and subtyping is therefore of great significance for precision medicine and cancer prognosis prediction. Although many researchers have begun to study the classification and prediction of cancer differentiation and subtype, many existing methods rely on traditional machine learning, and most use single-omics data. Methods based on multi-omics data integration are few, and their results leave room for improvement. It is therefore necessary to study a deep learning algorithm based on multi-omics data integration for classifying and predicting cancer grade and subtype.

In this paper, we propose a Multi-Omics data fusion algorithm based on Heterogeneous Graph Neural Networks (MOHGNN) for classifying cancer grade and subtype. The model consists mainly of a Graph Convolutional Network (GCN) module that learns features from each omics data type and a Graph Attention Network (GAT) module that integrates the multi-omics data. First, we performed feature selection on each omics data type separately, using algorithms such as the chi-square test and minimum redundancy maximum relevance (mRMR). Then, we constructed a weighted patient similarity network from the features of each omics type and trained the GCNs on the omics features and the corresponding similarity networks. Finally, the GAT integrates the different types of omics features and makes the final cancer classification prediction. MOHGNN is an end-to-end model in which all network modules are trained together.

To verify the classification performance of the MOHGNN model, we first used 5-fold cross-validation to compare it with traditional machine learning models and with currently popular methods based on multi-omics data integration. Our model performed well on both binary and multi-class prediction of cancer differentiation and subtype. On breast cancer subtypes, it achieved an average accuracy (ACC) of 91.8% on two-class prediction between any two subtypes and an ACC of 73.5% on multi-class prediction. Then, to select the modules that contribute most to cancer differentiation and subtype classification, we again used 5-fold cross-validation to test the predictive performance of different modules on the test set. Finally, to further test the classification performance of the model, we compared the classification of cancer differentiation and subtype when using a single omics data type, two omics data types simultaneously, and three omics data types simultaneously. The MOHGNN model achieved its best cancer classification performance when based on multiple omics data.
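The pipeline described above (patient similarity network, graph convolution per omics type, attention-based fusion) can be illustrated with a small, dependency-free Python sketch. This is not the thesis implementation: the abstract does not state the similarity measure, the edge threshold, or the layer sizes, so cosine similarity, `threshold=0.5`, and all function names below are assumptions made for illustration. The fusion step is a plain softmax weighting with hand-supplied scores, standing in for a GAT, which would learn its attention coefficients during training.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two patient feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def similarity_network(features, threshold=0.5):
    """Weighted patient similarity network (assumed rule: keep edges with
    similarity >= threshold; the threshold should be positive so weights stay positive)."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(features[i], features[j])
            if s >= threshold:
                adj[i][j] = adj[j][i] = s
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as commonly used in GCNs."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(norm_adj, features, weights):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, h) for h in row] for row in hidden]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(embeddings, scores):
    """Fuse one patient's per-omics embeddings with softmax attention weights.
    The scores are supplied by hand here; a GAT would learn them."""
    alphas = softmax(scores)
    dim = len(embeddings[0])
    return [sum(a * emb[k] for a, emb in zip(alphas, embeddings)) for k in range(dim)]

# Toy example: 3 patients, 2 selected features from one hypothetical omics type.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_adj = similarity_network(toy_features, threshold=0.5)
toy_norm = normalize_adjacency(toy_adj)
toy_hidden = gcn_layer(toy_norm, toy_features, [[1.0, 0.0], [0.0, 1.0]])
```

In the toy data, only the first two patients are similar enough to be connected, so the graph convolution mixes their features while the third patient keeps its own. In practice the GCN and attention weights would be trained jointly, as the abstract states for MOHGNN, typically with a deep learning framework, not by hand.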
Keywords/Search Tags:Multi-omics Data Integration, Cancer Differentiation, Cancer Subtypes, Heterogeneous Graph Neural Networks, Feature Selection