
Research On Cancer Subtype Clustering Based On Stacked Autoencoder

Posted on: 2022-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhang
Full Text: PDF
GTID: 2504306542463154
Subject: Computer technology
Abstract/Summary:
Cancer threatens human health and life worldwide, and its prevention and treatment remain a major focus of medical research. Advances in genomic sequencing technology have led to an accumulation of omics data, which brings both opportunities and challenges for analyzing the pathogenesis of cancer comprehensively and at multiple levels. In bioinformatics, the discovery of cancer subtypes has become an active research area. Multi-omics data can be used to divide a single cancer into distinct molecular subtypes, providing a basis and guidance for personalized diagnosis and treatment, which can greatly improve their efficiency. Clustering is one technique for achieving this goal. Clustering on multi-omics data often gives better results than clustering on single-omics data, but multi-omics data have few samples and high dimensionality, which has long been an important challenge for molecular research on cancer. In addition, the choice of fusion strategy has an important effect on multi-omics clustering results. This thesis uses a stacked autoencoder neural network to reduce the dimensionality of multi-omics data, builds a cancer subtype prediction model, and analyzes its clinical significance. The main contributions are as follows:

1. A method of cancer subtype discovery based on a stacked autoencoder is proposed. First, the raw features of the samples are fed into a three-layer stacked autoencoder to obtain a low-dimensional representation. Then, the low-dimensional features of the different omics are combined using the scaled exponential similarity kernel function to form a similarity network. Finally, cancer subtypes are identified on the similarity network with a spectral clustering algorithm. Compared with previous dimension-reduction methods, this method uses a stacked autoencoder to obtain a more meaningful low-dimensional latent representation. Compared with directly combining the multi-omics data, it more readily alleviates the bias caused by measurement differences between omics types during integration.

2. An optimized data fusion method based on stacked autoencoders is proposed. A stacked autoencoder can reveal latent nonlinear subspaces and extract more meaningful low-dimensional latent representations. Simply concatenating the multi-omics feature representations after dimensionality reduction is a relatively simple fusion scheme that does not account for the different data distributions of the omics types. Here, building on the dimensionality reduction of the stacked autoencoder, the low-dimensional feature representations are fused using the relative similarity of the networks. Experimental results show that the method is competitive with existing methods for multi-omics cancer subtype clustering. In the analysis of GBM (glioblastoma multiforme), the method found subtypes with large differences in drug response and age distribution.
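The similarity-network step in contribution 1 can be sketched as follows. This is a minimal illustration only, assuming that the kernel meant in the abstract is the scaled exponential similarity kernel known from similarity network fusion, W(i,j) = exp(-d(i,j)^2 / (mu * eps_ij)), where eps_ij averages the mean distances of i and j to their k nearest neighbours and d(i,j) itself. The function name, the values of `k` and `mu`, and the toy data are hypothetical and are not taken from the thesis.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scaled_exp_similarity(features, k=2, mu=0.5):
    """Build a sample-by-sample similarity matrix with a scaled exponential kernel.

    features: per-sample low-dimensional vectors (e.g. autoencoder outputs).
    k: neighbours used to estimate the local scale (assumed value).
    mu: bandwidth hyperparameter (assumed value).
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two samples")
    k = max(1, min(k, n - 1))

    # Pairwise distances between all samples.
    dist = [[euclidean(features[i], features[j]) for j in range(n)]
            for i in range(n)]

    # Mean distance from each sample to its k nearest neighbours (self excluded).
    knn_mean = []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        knn_mean.append(sum(others[:k]) / k)

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Local scale adapts the kernel width to the density around i and j.
            eps = (knn_mean[i] + knn_mean[j] + dist[i][j]) / 3.0
            sim[i][j] = math.exp(-dist[i][j] ** 2 / (mu * eps)) if eps > 0 else 1.0
    return sim


# Toy data: two tight groups of two samples each (hypothetical values).
toy = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
W = scaled_exp_similarity(toy, k=1, mu=0.5)
```

In this toy example, samples within the same group receive a much higher similarity than samples in different groups. A matrix of this kind, computed per omics type and then fused, is the sort of input a spectral clustering algorithm would operate on.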
Keywords/Search Tags:Fusion Optimization, Cancer Subtypes, Stacked Autoencoder, Multi-Omics Data