Research On Tumor Classification Based On Integrative Omics Analysis

Posted on: 2018-07-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Li
GTID: 1314330518965221
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Cancer is one of the major diseases threatening human health: its incidence has risen in recent years, its mortality remains high, and patients tend to be younger. Improving the diagnosis and treatment of cancer is therefore among the most urgent and important issues in current life-science and medical research. Traditionally, tumor classification, grade and stage are determined from histopathological features and are used in the clinic to predict patient prognosis and guide treatment, so accurate tumor classification plays a vital role in clinical diagnosis and treatment. However, traditional tumor classification is not accurate enough: patients with the same subtype who receive the same treatment often have considerably different prognoses. Classification based on pathological features alone is thus far from meeting the needs of precise tumor diagnosis and treatment. Cancer is a heterogeneous disease, and tumor cells usually carry concurrent abnormalities in the genome, transcriptome, epigenome and proteome. Tumor molecular classification is based on the intrinsic molecular characteristics of tumor cells and can therefore reflect the nature of tumor development more objectively. Research on tumor molecular classification will not only provide more accurate classification in the clinic, but also help to understand the molecular mechanisms of different subtypes, guide clinical treatment and predict patient prognosis. It will be the cornerstone of personalized treatment and an important basis for promoting precision medicine.

With the development of high-throughput sequencing technology, a large amount of disease-related omics data has accumulated, especially for cancer. Subtyping on single-omics data has made some achievements, notably the breast cancer transcriptome classification system, which has been clinically proven. Tumor heterogeneity, however, is reflected at every omics level, including the genome, transcriptome, proteome and epigenome. Any single-omics data set reflects the intrinsic molecular characteristics of a tumor from only one perspective, whereas integrating multiple omics can capture heterogeneity across omics levels and identify more accurate tumor molecular classifications. A comprehensive understanding of cancer from multiple omics levels has therefore become a new trend in cancer research. Over the last decade, the TCGA project has collected multi-omics data for a total of 34 kinds of tumors and more than 10,000 patient specimens. These abundant data lay the foundation for understanding the development and progression of tumors from multiple omics perspectives, and bring both opportunities and challenges for the integrative analysis of multi-omics data. Here, we explore tumor classification based on integrative omics analysis using different integration methods. In this dissertation, we carried out molecular subtyping of tumors using network-based integration and data-based integration.

First, we performed tumor molecular subtyping by introducing lncRNAs into NBS, an existing analytical method that integrates a molecular network with omics data. The core of network-based integration is fusing the information in the molecular network with the omics data, which involves a network propagation algorithm for data fusion and unsupervised clustering for classification. NBS has successfully identified clinically relevant tumor molecular subtypes by integrating protein interaction network information with gene mutation data. However, the protein interaction network contains only protein-coding genes and carries no information about the important role of non-coding RNAs in tumors. lncRNAs are an important class of regulatory non-coding RNAs discovered in recent years, and many studies have reported that they are closely related to tumor development. To systematically analyze the relationship between protein-coding genes and lncRNAs, we built an lncRNA-protein association network by gene co-expression analysis and, on that basis, integrated proteome data for tumor molecular subtyping in breast cancer. Taking the TCGA BRCA cohort as an example, we first built a breast cancer-specific lncRNA-protein association network from transcriptome expression profiles, then integrated RPPA expression data from the same cohort using the NBS framework, and finally identified six subtypes of breast cancer from the fused data matrix using the consensus NMF method. The subtyping result is highly consistent with the known PAM50 subtypes and the ER/PR/HER2 biomarkers. Further analysis shows that protein expression patterns differ significantly among the subtypes. These results demonstrate that integrative analysis based on the lncRNA-protein association network can effectively identify tumor classes with significant molecular characteristics.

Second, we proposed a new method for integrating multiple omics data types in unsupervised clustering analysis and applied it to establish a tumor classification system. Methods for integrating multi-omics data must overcome several problems: the small number of samples relative to the large number of features; differences in scale, collection bias and noise among data sets from different sources; and the need to exploit complementary information from different omics data simultaneously. Transcriptome analysis reveals the molecular characteristics of tumors and has been widely used in cancer studies. In this dissertation, we proposed a new clustering method for integrating multi-omics data, named ICC. ICC integrates multiple data types by transforming the information in each data type into a patient similarity matrix and merging these matrices into a fused patient similarity matrix. The final clustering result is determined by consensus clustering on the fused patient similarity matrix. One advantage of the method is that it does not require normalization of data across types or platforms before integration. Any data type can be included in the analysis, including gene mutations, CNA, DNA methylation and gene/miRNA/protein expression profiles, and the method can effectively integrate the consistency and complementarity of multi-omics data sets. Next, we applied ICC to mRNA, miRNA and lncRNA expression data for a cohort of 431 TCGA ccRCCs and identified five robust subtypes associated with clinicopathologic features, genomic aberrations and molecular expression patterns. Moreover, the integrated transcriptome classification system accurately identified 19 of the 20 samples in our study that had been misdiagnosed as ccRCC, demonstrating the ability of ICC to distinguish other RCCs from ccRCCs. These results suggest that integrating protein-coding and non-coding RNA is important for the molecular classification of tumors and demonstrate that ICC is effective.

Finally, we summarized the work and discussed prospects. This dissertation focused on tumor molecular classification by integrating multiple omics data types, including the integration of molecular network information with omics data and the integration of multi-omics data. In view of the important regulatory functions of non-coding RNAs in tumor development, we introduced miRNAs and/or lncRNAs into the analysis. We integrated an lncRNA-protein association network with proteome data to identify 6 breast cancer subtypes, and integrated protein-coding and non-coding RNA expression data to identify 4 ccRCC integrated transcriptome subtypes using ICC, which is proposed here for the first time. Further analysis shows that the two integrative analysis methods described here can effectively identify tumor molecular subtypes with clinical relevance and distinct molecular characteristics.
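
The network propagation step mentioned for the network-based integration can be illustrated with a minimal sketch. The abstract does not give the update rule, so the random-walk-with-restart form below, the function name `propagate`, the restart weight `alpha`, and the toy three-gene network are all assumptions made for illustration, not the implementation used in the dissertation.

```python
import numpy as np

def propagate(profiles, adjacency, alpha=0.7, tol=1e-6, max_iter=100):
    """Smooth patient-by-gene profiles over a gene network.

    Assumed update rule (random walk with restart):
        F_next = alpha * F @ A_norm + (1 - alpha) * F_0
    where A_norm is the adjacency matrix with each column divided by its sum.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    start = np.asarray(profiles, dtype=float)

    degree = adjacency.sum(axis=0)
    degree[degree == 0] = 1.0          # isolated nodes: avoid division by zero
    norm_adj = adjacency / degree      # every non-isolated column now sums to 1

    current = start.copy()
    for _ in range(max_iter):
        # Each gene receives the average score of its neighbours, mixed with
        # the patient's original (unsmoothed) profile.
        updated = alpha * current @ norm_adj + (1 - alpha) * start
        if np.linalg.norm(updated - current) < tol:
            return updated
        current = updated
    return current

# Hypothetical toy input: a chain network gene0 - gene1 - gene2 and one
# patient whose only signal is on gene0.
chain = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
signal = np.array([[1.0, 0.0, 0.0]])
smoothed = propagate(signal, chain)
print(smoothed)
```

After propagation the signal on gene0 has spread to its network neighbours, with less weight reaching the more distant gene2. The smoothed patient-by-gene matrix is what an unsupervised clustering step would then take as input.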
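
The data-based integration idea behind ICC (one patient similarity matrix per data type, merged into a fused matrix, then clustered) can be sketched as follows. The abstract does not say how similarities are computed or merged, so everything here is an assumption: Pearson correlation between patients as the similarity, an element-wise mean as the fusion, and a single average-linkage hierarchical clustering standing in for the consensus clustering step. The function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def patient_similarity(data):
    """Pearson correlation between patients; `data` is patients x features."""
    return np.corrcoef(np.asarray(data, dtype=float))

def fuse(similarities):
    """Element-wise mean of the per-data-type patient similarity matrices."""
    return np.mean(np.stack(similarities), axis=0)

def cluster_fused(fused, n_clusters):
    """Cut an average-linkage tree built on (1 - similarity) into n_clusters.

    This is a stand-in for consensus clustering, not consensus clustering itself.
    """
    distance = 1.0 - fused
    distance = (distance + distance.T) / 2.0   # enforce exact symmetry
    np.fill_diagonal(distance, 0.0)
    distance = np.clip(distance, 0.0, None)    # remove tiny negative round-off
    tree = linkage(squareform(distance, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Hypothetical toy cohort: 10 patients, two data types on very different scales.
rng = np.random.default_rng(0)
pattern_a = rng.normal(size=50)
pattern_b = rng.normal(size=30)
group = np.array([1.0] * 5 + [-1.0] * 5)[:, None]   # two opposite groups
layer_1 = group * pattern_a + 0.1 * rng.normal(size=(10, 50))
layer_2 = 1000.0 * (group * pattern_b + 0.1 * rng.normal(size=(10, 30)))

fused = fuse([patient_similarity(layer_1), patient_similarity(layer_2)])
labels = cluster_fused(fused, n_clusters=2)
print(labels)
```

Because each data type is reduced to a patients-by-patients matrix before merging, the two layers never have to share a feature space or a scale, which is one way to read the abstract's point that no cross-platform normalization is required.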
Keywords/Search Tags: non-coding RNA, integrative omics, tumor molecular classification