
Identification Of Activated Pathways In Lung Adenocarcinoma Based On A Novel Algorithm Of Construction Of Gene Interaction Network

Posted on: 2017-06-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Chen
GTID: 1314330512452727
Subject: Internal Medicine
Abstract/Summary:
Lung cancer, of which non-small-cell lung cancer (NSCLC) accounts for approximately 80%, is the most common cause of cancer-related death in both developing and developed regions. Lung adenocarcinoma, a major histological subtype of NSCLC, arises from the epithelial cells of the small bronchi, bronchioles or alveoli, and is typically peripherally located, as reviewed elsewhere. It accounts for almost 50% of deaths attributable to lung cancer.

Bioinformatics is an interdisciplinary science combining biology, statistics and computer science, and is now widely used to analyze large volumes of biological data. With its development, a new model of biological research has emerged that consists of two steps: first, theoretical prediction using existing data held in open-access databases; second, experimental verification using actual data. Recently, researchers have paid increasing attention to gene expression profiling, pathway analysis and targeted therapy of complex diseases. Studying the disease at the molecular level is crucial for guiding the prevention, diagnosis and treatment of lung adenocarcinoma.

In the present study, we retrieved microarray data for lung adenocarcinoma from the ArrayExpress database and identified differentially expressed (DE) genes by comparing gene expression in lung adenocarcinoma with that in normal controls. We then combined four existing network methods, namely the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, the Differentially Coexpressed Genes and Links (DCGL) package, the empirical Bayesian (EB) meta-analysis approach and the Weighted Gene Coexpression Network Analysis (WGCNA) package, into a novel rank-based algorithm using a combined score, referred to as the combined method. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of dysregulated genes was performed based on the Expression Analysis Systematic Explorer (EASE) test to illuminate the biological pathways.
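
As an illustration of the EASE test mentioned above, the following is a minimal sketch rather than the procedure used in the dissertation. The EASE score is commonly described as a conservative variant of the one-sided Fisher exact test in which one gene is removed from the list hits before the p-value is computed. The sketch uses a hypergeometric formulation; the function names and the toy counts are assumptions made for illustration, and the exact contingency-table layout used by a given tool may differ.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    if k <= 0:
        return 1.0
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible tables contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Conservative one-sided p-value: one hit is removed from the gene list (assumed convention)."""
    return hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)


if __name__ == "__main__":
    # Toy counts, not taken from the study: 6 of 50 listed genes fall in a
    # pathway that contains 40 of 2000 background genes.
    print("Fisher upper tail:", hypergeom_upper_tail(6, 50, 40, 2000))
    print("EASE score:       ", ease_score(6, 50, 40, 2000))
```

Because one hit is removed, the EASE score is never smaller than the plain one-sided p-value, which penalizes pathways supported by only a few genes.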
Pathway activity analysis was conducted to compute the distribution of pathways across tumor stages (IA, IB, IIA, IIB, IIIA, IIIB and IV) and to identify activated pathways in lung adenocarcinoma. This work should help reveal the underlying pathogenesis and molecular biomarkers associated with the progression of lung adenocarcinoma, and provides novel insights into its development, diagnosis and treatment.

Part 1: Construction of protein interaction network involved in lung adenocarcinomas using a novel algorithm

Background: Lung cancer is the most common cause of cancer-related death worldwide [1]. Lung adenocarcinoma is a major histological subtype of lung cancer, and its incidence has increased over the past years. Studies that only assess DE genes do not contain the information required to investigate the mechanisms of diseases. A complete knowledge of all the direct and indirect interactions between proteins may serve as a significant benchmark in forming a comprehensive description of cellular mechanisms and functions. The results of protein interaction network studies are often inconsistent and rely on various methods.

Objective: In the present study, a combined network was constructed from selected gene pairs, after a novel algorithm converted and combined the scores that multiple approaches assigned to each gene pair. Hub genes and pathways associated with lung adenocarcinoma were then identified from the combined co-expression network.

Methods: Samples from patients with and without lung adenocarcinoma were compared, and the RankProd package was used to identify DE genes. Next, STRING, the DCGL package, the EB approach and the WGCNA package were used for network construction. A combined network was also constructed with a novel rank-based algorithm using a combined score. The topological features of the 5 networks were analyzed and compared.
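
The abstract does not spell out how the scores from the four methods are converted and combined, so the following is only a hedged sketch of one plausible rank-based scheme, not the author's algorithm. It assumes that each method's raw scores are replaced by their rank scaled to (0, 1], that ties share the average rank, that a gene pair missing from a method contributes 0, and that the combined score is the mean across methods. The names `rank_normalize` and `combine_scores`, the gene labels and the scores are all hypothetical.

```python
def rank_normalize(scores):
    """Map raw scores to rank / n in (0, 1]; tied scores share their average rank."""
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    normalized = {}
    i = 0
    while i < n:
        j = i
        # Extend j over the run of pairs tied with position i.
        while j + 1 < n and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for pair in ordered[i:j + 1]:
            normalized[pair] = avg_rank / n
        i = j + 1
    return normalized


def combine_scores(method_scores):
    """Average the rank-normalized scores of each gene pair over all methods."""
    normalized = [rank_normalize(s) for s in method_scores.values()]
    pairs = set().union(*(s.keys() for s in normalized))
    return {p: sum(s.get(p, 0.0) for s in normalized) / len(normalized) for p in pairs}


if __name__ == "__main__":
    # Toy scores on placeholder genes; real inputs would come from STRING, DCGL, EB and WGCNA.
    toy = {
        "method_1": {("g1", "g2"): 0.9, ("g2", "g3"): 0.7},
        "method_2": {("g1", "g2"): 0.2, ("g1", "g3"): 0.4},
    }
    for pair, score in sorted(combine_scores(toy).items(), key=lambda kv: -kv[1]):
        print(pair, score)
```

Working on ranks rather than raw values puts scores with very different scales (for example, database confidence versus correlation strength) on a common footing before averaging; gene pairs above a chosen combined-score cutoff would then form the combined network.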
Functional enrichment analysis was performed to identify the pathways enriched among the gene pairs from the five methods.

Results: A total of 941 DE genes were screened in lung adenocarcinomas, including 386 up-regulated genes and 555 down-regulated genes. Using 4 existing methods (STRING, DCGL, EB, WGCNA) and the novel combined method we present, 5 co-expression networks were constructed. We compared the topological structures of the five co-expression networks and found that 4 networks exhibited the scale-free property, with a degree distribution that follows the power law with high fitting coefficients (0.931, 0.938, 0.963 and 0.977, respectively), with the exception of the network constructed using the WGCNA method (0.264). We also calculated the mean shortest path length for the five networks (5.337, 2.715, 3.673, 1.783 and 4.195, respectively). The topological analysis indicated that the gene interaction network constructed using the WGCNA method was more likely to show the small-world property, which is characterized by a small average shortest path length and a large clustering coefficient, whereas the combined network was confirmed to be a scale-free network. Fifteen hub genes were identified from the combined co-expression network, including 12 up-regulated genes (TOP2A, PAICS, BUB1, ADAM12, FGB, NONO, UGT8, SRPX2, AOC1, AURKA, NCAPG, RACGAP1) and 3 down-regulated genes (IL1RL1, TACC1, DARC). The DE genes were significantly enriched in 7 pathways: extracellular matrix receptor interaction, cell adhesion molecules, p53 signaling pathway, focal adhesion, vascular smooth muscle contraction, cell cycle, and complement and coagulation cascades. Gene pairs identified using the novel combined method were mostly enriched in the cell cycle and p53 signaling pathways. The pathway in which gene pairs were enriched across all 5 methods was the cell cycle.

Conclusion: In the present study, we successfully identified 941 DE genes in lung adenocarcinoma.
By reassembling the scores of gene pairs from 4 existing methods, we successfully constructed a combined network. The combined network showed scale-free features, indicating greater robustness against random failure than the other networks. The network constructed by the WGCNA method was more inclined to show the small-world property, which enables rapid integration of information. Moreover, topological analysis identified 15 hub genes from the combined co-expression network, which might play important roles in the development of lung adenocarcinoma. Pathway enrichment analysis indicated that the cell cycle pathway might be related to the pathogenesis of lung adenocarcinoma.

Part 2: Identification of activated pathways in lung adenocarcinoma

Background: The incidence and mortality of lung adenocarcinoma have increased over the past years, and most patients have a very poor prognosis. With the development of bioinformatics, researchers have paid increasing attention to gene expression profiling, pathway analysis and targeted therapy of lung adenocarcinoma. At present, the success of network-based pathway identification and classification supports the notion that cancer is indeed a 'disease of pathways'. A pathway can be in one of two states: activated or nonactivated. Activated pathways might play more important roles in complex disease than nonactivated ones. We therefore set out to identify the pathways activated during the development of lung adenocarcinoma.

Objective: To identify activated pathways in the development of lung adenocarcinoma based on gene co-expression network analysis, and to identify the underlying molecular biomarkers that could contribute to the early diagnosis and therapy of lung adenocarcinoma.

Methods: The microarray expression profiles of patients with various stages of lung adenocarcinoma were downloaded from the ArrayExpress database.
Then, the DE genes in lung adenocarcinoma were identified with the RankProd package. KEGG pathway analysis of dysregulated genes was performed based on the EASE test to illuminate the biological pathways. Co-expression networks of lung adenocarcinoma in different tumor stages (IA, IB, IIA, IIB, IIIA, IIIB and IV) were constructed by the novel combined approach. Pathway activity analysis, in which a permutation strategy was used, was conducted to compute the distribution of pathways in different stages and to identify "activated" pathways.

Results: In this work, we evaluated 211 dysregulated genes between lung adenocarcinoma patients and normal controls. Based on the expression values of these dysregulated genes in different stages, we constructed 7 co-expression networks, one for each tumor stage (IA, IB, IIA, IIB, IIIA, IIIB and IV). The number of interactions in these co-expression networks showed no regular pattern across stages. Based on the 7 stage-specific co-expression networks, pathway activity analysis was performed and 10 common pathways were identified: cell cycle, progesterone-mediated oocyte maturation, oocyte meiosis, ECM-receptor interaction, vascular smooth muscle contraction, neuroactive ligand-receptor interaction, pathways in cancer, p53 signaling pathway, renin-angiotensin system, and renal cell carcinoma. Pathway activity analysis showed that cell cycle, progesterone-mediated oocyte maturation and oocyte meiosis were activated during all stages of lung adenocarcinoma. Meanwhile, the p53 signaling pathway and pathways in cancer were activated in all stages of lung adenocarcinoma except stage IIIA. In addition, the renin-angiotensin system pathway was nonactivated.

Conclusions: Based on dysregulated genes, we identified 10 common pathways related to the development of lung adenocarcinoma. We successfully identified 3 pathways (cell cycle, progesterone-mediated oocyte maturation and oocyte meiosis) that were activated across the different stages (IA, IB, IIA, IIB, IIIA, IIIB and IV) of lung adenocarcinoma, which might contribute to the diagnosis and therapy of lung adenocarcinoma.
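
The abstract states that a permutation strategy was used to call pathways "activated" but does not define the activity score, so the sketch below rests on stated assumptions rather than on the dissertation's method. It assumes that a pathway's activity in a stage-specific co-expression network is the sum of edge weights between its member genes, and that the null distribution comes from random gene sets of the same size drawn from the network. The function names, the toy network and the number of permutations are hypothetical.

```python
import random


def pathway_score(edges, genes):
    """Sum the weights of edges whose two endpoints both lie in the gene set (assumed activity score)."""
    genes = set(genes)
    return sum(w for (a, b), w in edges.items() if a in genes and b in genes)


def permutation_p_value(edges, pathway_genes, n_perm=1000, seed=0):
    """Empirical p-value of the pathway score against random same-size gene sets."""
    rng = random.Random(seed)
    universe = sorted({g for pair in edges for g in pair})
    present = set(universe)
    members = [g for g in pathway_genes if g in present]
    observed = pathway_score(edges, members)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, len(members))
        if pathway_score(edges, sample) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate away from an impossible p = 0.
    return observed, (hits + 1) / (n_perm + 1)


def toy_network():
    """Placeholder network: a tightly co-expressed 4-gene module plus a weak chain of 16 genes."""
    edges = {}
    module = ["g0", "g1", "g2", "g3"]
    for i, a in enumerate(module):
        for b in module[i + 1:]:
            edges[(a, b)] = 1.0
    for i in range(3, 19):
        edges[(f"g{i}", f"g{i + 1}")] = 0.1
    return edges, module


if __name__ == "__main__":
    edges, module = toy_network()
    score, p = permutation_p_value(edges, module)
    print("observed score:", score, "empirical p-value:", p)
```

Under these assumptions, a pathway would be called activated in a given stage when its empirical p-value falls below a chosen threshold; repeating the test on each of the 7 stage-specific networks would give the per-stage activation pattern described in the Results.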
Keywords/Search Tags:lung adenocarcinoma, gene interaction, network, topological analysis, stage, pathway, activate