| With its high incidence and low cure rate, cancer has become one of the deadliest diseases in the world, and effective treatment is urgently needed. However, cancer is heterogeneous, and patients with the same type of cancer may have different clinical manifestations. Dividing cancer patients into subtypes enables personalized treatment for heterogeneous patients, which is crucial for their treatment and prognosis. At present, according to the number of omics data types used, cancer subtype classification can be divided into two research settings: single-omics data and multi-omics data. In this paper, two algorithms are developed to address the problems in these two settings. (1) Most single-omics cancer subtype classification methods only extract features of individual samples to detect cancer subtypes and ignore the correlation between samples. We therefore propose a residual graph convolution model based on a sample similarity network to identify cancer subtypes. First, we construct a sample similarity network from the gene expression data of the samples. Then, the gene expression data of the cancer samples, as initial features, and the sample similarity network are fed into a two-layer graph convolutional network (GCN) model. In addition, we introduce the initial features as residuals in the GCN model to avoid over-smoothing during training. Finally, the cancer subtype classification is obtained through a softmax activation function. (2) Omics data are usually diverse and high-dimensional, and different omics data follow different distributions. How to effectively integrate multiple omics data to accurately classify cancer subtypes is a challenge for researchers. Considering that most existing methods for cancer subtype classification based on multi-omics data fusion do not realize interaction between the features of different omics data, we design a method to identify cancer subtypes based
on supervised graph contrastive learning and the fusion of multi-omics data. This method uses gene expression, miRNA expression, and DNA methylation data as sample features. First, the method calculates the Pearson correlation coefficient between samples for each type of omics data to construct multiple sample adjacency matrices. Then, each type of omics data and its adjacency matrix are fed into a separate residual graph convolution model, and the features specific to each omics type are obtained through two different fully connected layers. Next, a supervised contrastive loss pulls the features of samples of the same subtype from different omics together and pushes those of different subtypes apart, so that the features of different omics interact and the multi-omics features of the samples are obtained. Finally, the cancer subtype classification is obtained by combining the multi-omics features of the samples and applying a classifier. We applied these two models to the invasive breast cancer (BRCA) and glioblastoma multiforme (GBM) data sets. Compared with other single-omics and multi-omics fusion methods, our model achieves better results on the evaluation metrics. Additionally, the results of the survival analysis show that the cancer subtypes identified by our model have significant clinical characteristics. Furthermore, we conducted GO enrichment and KEGG pathway analysis experiments using our model. |
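
The single-omics pipeline described in (1) can be illustrated with a minimal NumPy sketch of the forward pass only. The abstract does not give the details, so several choices here are assumptions for illustration: the correlation cutoff `threshold`, the symmetric normalization of the adjacency matrix, the linear projection `W_res` used to add the initial features as a residual, and the random, untrained weights. All function names are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def pearson_adjacency(X, threshold=0.5):
    """Sample similarity network from a (samples x genes) expression matrix.

    The cutoff `threshold` is an assumed hyperparameter, not from the paper.
    """
    corr = np.corrcoef(X)                    # Pearson correlation between samples (rows)
    A = (corr >= threshold).astype(float)    # keep strongly correlated sample pairs
    np.fill_diagonal(A, 1.0)                 # self-loops so each sample keeps its own signal
    return A

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (a common GCN choice, assumed here)."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)     # subtract row max for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def residual_gcn_forward(X, A_hat, W0, W_res, W1):
    """Two-layer GCN with the initial features added back as a residual."""
    H1 = np.maximum(A_hat @ X @ W0, 0.0)     # first graph convolution + ReLU
    H1 = H1 + X @ W_res                      # residual from initial features (assumed form)
    logits = A_hat @ H1 @ W1                 # second graph convolution
    return softmax(logits)                   # per-sample subtype probabilities

# Toy usage with random data: 6 samples, 20 genes, 8 hidden units, 3 subtypes.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 20))
A = pearson_adjacency(X)
A_hat = normalize_adjacency(A)
W0 = rng.normal(scale=0.1, size=(20, 8))
W_res = rng.normal(scale=0.1, size=(20, 8))
W1 = rng.normal(scale=0.1, size=(8, 3))
probs = residual_gcn_forward(X, A_hat, W0, W_res, W1)
```

In practice the weights would be trained with a cross-entropy loss on labeled subtypes. The multi-omics method in (2) would run one such model per omics type before the contrastive fusion step.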