
Cross-Omics And Integrative Analysis Of The Gene Regulatory Network During Erythroid And Megakaryocytic Differentiation Of K562 Cells

Posted on: 2016-11-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 1220330503476666
Subject: Biomedical engineering
Abstract/Summary:
The transcriptome is an important intermediate in translating the genomic DNA code into proteins. Two high-throughput technologies, microarray and RNA-Seq, are used to study it. Although microarrays are cheaper than RNA-Seq, they have disadvantages: they can only examine species with a known genome, they have high background signal, and they require complex normalization to test for differentially expressed genes. Compared with microarrays, RNA-Seq offers significantly improved sensitivity and accuracy in transcriptome analysis, and its advantages include no need for probes or a reference genome, high resolution, low sample input and a wide range of applications. RNA-Seq has been rapidly adopted for transcriptome profiling in many areas of biology and has revolutionized our understanding of the complexity and plasticity of gene expression regulation. RNA-Seq can be used to analyse gene structure (UTRs and alternative splicing), genetic variation (fusion genes and cSNPs), non-coding regions (lncRNAs and primary miRNAs) and differentially expressed genes. As next-generation sequencing becomes cheap and routine, RNA-Seq will replace microarrays in the near future.

Lineage-specific differentiation generates the various committed cells and determines cell fate in hematopoiesis. Studying the mechanisms of erythroid and megakaryocytic differentiation is critical for developing therapeutic strategies for blood diseases. Human leukemia K562 cells can differentiate into the erythroid lineage upon hemin induction or into the megakaryocytic lineage upon PMA induction, making them an important model for studying hematopoietic differentiation and leukemogenesis as well as transcriptional regulation. The gene regulatory mechanism is intricate, and combinatorial interactions of multiple biological factors are required to modulate erythroid and megakaryocytic differentiation.
Here, we studied a regulatory network of divergently expressed genes bound by lineage-specific transcription factors by comparing two time-series microarray expression profiles, obtained from the GEO database, of K562 cells during hemin-induced erythroid differentiation and PMA-induced megakaryocytic differentiation. Because the two datasets came from different research groups and used different treatment time points, we performed RNA-Seq on K562 cells at 2, 6, 24 and 48 hours of hemin-induced erythroid differentiation and of PMA-induced megakaryocytic differentiation. By integrating TFs, miRNAs and differentially expressed genes associated with erythroid and megakaryocytic differentiation, we established a multi-layer gene regulatory network and identified key genes and signaling pathways associated with both processes, which will support studies of alternative splicing, fusion genes and lncRNAs involved in erythroid and megakaryocytic differentiation. We also analysed the stability of lincRNAs in K562 cells, which will facilitate studies of lincRNAs during erythroid and megakaryocytic differentiation. Collectively, these studies offer a panoramic view of the transcriptional landscape during erythroid and megakaryocytic differentiation, and provide evidence for understanding the complex mechanisms of hematopoietic lineage commitment and for developing therapeutic strategies for blood diseases.

(1) A regulatory network of divergently expressed genes bound by lineage-specific transcription factors during erythro-megakaryocytic differentiation of K562 cells

Analysis of microarray expression profiles of hemin-induced erythroid differentiation in K562 cells (GSE1036, Human Genome HU133A oligonucleotide array GeneChip) identified 281 up-regulated genes (332 probe sets) and 402 down-regulated genes (479 probe sets).
Likewise, 683 genes (755 probe sets) were up-regulated and 735 genes (810 probe sets) were down-regulated during PMA-induced megakaryocytic differentiation of K562 cells (GSE12736, Illumina HumanRef-8 Expression BeadChip). Comparing the two time-series microarray expression profiles identified 88 divergently expressed genes. Furthermore, we established a regulatory network of divergently expressed genes bound by lineage-specific transcription factors (GATA-1, GATA-2 and PU.1) using ChIP-Seq datasets. In this network, seven hub genes (SPI1, GATA-2, GATA-1, ID2, JUN, MYC and EGR-1) extracted from erythroid differentiation and a unit of five integrated genes (ID2, MYC, PIM1, STAT5B and SAC3D1) extracted from megakaryocytic differentiation are potentially associated with erythro-megakaryocytic differentiation, especially ID2 and MYC. These studies identified new links and hub genes associated with divergently expressed genes responsible for erythro-megakaryocytic differentiation.

(2) A regulatory network based on combinatorial interactions of transcription factors and miRNAs during erythro-megakaryocytic differentiation of K562 cells

We sequenced 9 RNA-Seq datasets at different stages (K562 cells treated with PMA or hemin for 2, 6, 24 and 48 hours, as well as untreated K562 cells) and reconstructed their transcriptomes after quality control. Differentially expressed genes were identified using DESeq2 as those showing at least a two-fold change relative to untreated K562 cells (|log2 fold change| ≥ 1) at one or more time points. During megakaryocytic differentiation, 4,216 differentially expressed genes were identified, 234 of which appeared at all four treated stages, whereas only 1,418 were identified by microarray.
During erythroid differentiation, 1,826 differentially expressed genes were identified, most of which appeared at the 24-hour (1,506) and 48-hour (1,016) stages, whereas only 683 were identified by microarray. Our analysis revealed that PMA-induced megakaryocytic differentiation is more drastic than hemin-induced erythroid differentiation in K562 cells: 1,959 differentially expressed genes had already appeared at 2 hours of megakaryocytic differentiation, whereas only 32 had appeared by 6 hours of erythroid differentiation. Furthermore, erythroid differentiation yielded 2,390 fewer differentially expressed genes than megakaryocytic differentiation.

Comparing megakaryocytic and erythroid differentiation revealed 195 divergently expressed genes, 113 of which were up-regulated after hemin induction but down-regulated after PMA induction, and 82 of which were down-regulated after hemin induction but up-regulated after PMA induction. Gene function enrichment analysis (P value < 0.01) revealed that divergently expressed genes, including GATA-2 and EGR-1, were strikingly enriched in the Notch signaling pathway and cell differentiation. K-means clustering classified the differentially expressed genes of erythroid differentiation into 8 distinct expression clusters, of which cluster 1 (primarily down-regulated genes) and cluster 4 (mainly progressively up-regulated genes) were associated with cell differentiation. Likewise, K-means clustering classified the differentially expressed genes of megakaryocytic differentiation into 10 distinct expression clusters, of which cluster 3 (predominantly up-regulated genes) and cluster 8 (entirely down-regulated genes) were associated with cell differentiation.

Six key TFs (GATA-1, GATA-2, EGR1, MYC, JUN and FOS) are involved in megakaryocytic and erythroid differentiation.
We obtained their genome-wide binding profiles in K562 cells from chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data in the SRA database, and identified 4,863 differentially expressed target genes, including 4,623 EGR1, 2,794 FOS, 1,724 GATA1, 3,477 GATA2, 3,403 JUN and 3,215 MYC target genes. We also analysed the distribution of target genes of the six TFs in clusters 1 and 4 of erythroid differentiation, in clusters 3 and 8 of megakaryocytic differentiation, and among the divergently expressed genes. In general, combinatorial interactions of multiple TFs are favored and vital in the transcriptional control of their target genes: 848 differentially expressed genes showed overlapping binding of all six TFs, 40 of which were divergently expressed genes.

From previous reports, we collected 14 well-studied miRNAs associated with erythroid differentiation and 15 well-studied miRNAs associated with megakaryocytic differentiation. To obtain highly reliable target genes, for each miRNA only target genes predicted by two or more tools (TargetScan, miRanda and PicTar2) were retained for downstream analysis. Overall, 3,846 target genes were predicted for miRNAs implicated in erythroid differentiation, 1,148 of which were differentially expressed genes. Likewise, 5,555 target genes were predicted for miRNAs associated with megakaryocytic differentiation, 1,670 of which were differentially expressed genes. We studied the number and distribution of target genes of the top 5 miRNAs associated with erythroid and with megakaryocytic differentiation in clusters 1 and 4 of erythroid differentiation, in clusters 3 and 8 of megakaryocytic differentiation, and among the divergently expressed genes.
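The requirement that a miRNA target be predicted by two or more tools can be illustrated with a minimal Python sketch. This is not the pipeline used in the dissertation: the function name `consensus_targets`, the gene identifiers and the per-tool prediction sets are hypothetical placeholders, and only the two-of-three voting rule comes from the text.

```python
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Return the genes predicted as targets by at least `min_tools` tools.

    `predictions` maps a tool name to the set of genes it predicts
    for a single miRNA.
    """
    votes = Counter()
    for genes in predictions.values():
        votes.update(set(genes))  # each tool votes at most once per gene
    return {gene for gene, n in votes.items() if n >= min_tools}


# Hypothetical predictions for one miRNA (gene names are placeholders).
predictions = {
    "TargetScan": {"GENE_A", "GENE_B", "GENE_C"},
    "miRanda": {"GENE_B", "GENE_C", "GENE_D"},
    "PicTar2": {"GENE_C", "GENE_E"},
}

reliable = consensus_targets(predictions)

# Restrict the reliable targets to a (hypothetical) set of
# differentially expressed genes, as in the downstream analysis.
de_genes = {"GENE_B", "GENE_E"}
reliable_de = reliable & de_genes

print(sorted(reliable))     # ['GENE_B', 'GENE_C']
print(sorted(reliable_de))  # ['GENE_B']
```

Raising `min_tools` to 3 would keep only targets on which all three tools agree, trading sensitivity for reliability.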
Generally, collective binding by multiple miRNAs is favored in regulating their target genes. In total, 243 differentially expressed genes (including 10 divergently expressed genes) were bound by at least three miRNAs implicated in erythroid differentiation, 56 of which were also bound by all six TFs. Similarly, 432 differentially expressed genes (including 16 divergently expressed genes) were bound by at least three miRNAs associated with megakaryocytic differentiation, 102 of which were also bound by all six TFs.

Based on differentially expressed genes under combinatorial binding of TFs and miRNAs implicated in erythroid and megakaryocytic differentiation, we established a gene regulatory network for erythro-megakaryocytic differentiation of K562 cells. We analysed this network from several aspects, including gene function enrichment analysis, hotspot detection, sub-network extraction and cluster analysis, and found that JUN was the kernel node of the dense gene regulatory network and that the TGF-beta/Smad and Ras/ERK pathways were involved in erythroid and megakaryocytic differentiation. Overall, this study provides an integrated view of protein-coding genes in K562 cells during erythro-megakaryocytic differentiation and new insights into the multiple layers of the transcriptional regulatory hierarchy that controls lineage commitment.

(3) Analysis of the stability of long intergenic non-coding RNAs in K562 cells

The stability of long intergenic non-coding RNAs (lincRNAs), which show tissue- and cell-specific expression, might be closely related to their physiological functions. However, the mechanisms underlying lincRNA stability remain elusive. We studied the stability of lincRNAs in K562 cells, an important model cell line, by comparing two K562 transcriptomes: one obtained from the ENCODE Consortium and one from our own sequenced RNA-Seq dataset (PH).
To obtain a high-confidence lincRNA catalog, we improved the lincRNA prediction pipeline by filtering out transcripts with coding potential using iSeeRNA, CPAT, CPC and PhyloCSF in turn. Using this pipeline, 1,804 high-confidence lincRNAs, comprising 1,564 annotated lincRNAs and 240 putative novel lincRNAs, were identified in PH, and 1,587 high-confidence lincRNAs, comprising 1,429 annotated lincRNAs and 158 putative novel lincRNAs, were identified in ENCODE. There were 1,009 lincRNAs unique to PH, 792 unique to ENCODE, and 795 overlapping lincRNAs present in both datasets. Analysis of differences in minimum free energy distribution and lincRNA half-life showed that a large proportion of overlapping lincRNAs were more stable than the unique lincRNAs. In PH, 2,914 protein-coding RNAs were identified as the intersection of Cuffcompare's results with four public database annotations (Ensembl, UCSC, GENCODE and RefSeq), 2,546 (87.4%) of which were also present in ENCODE. By comparison, 795 lincRNAs (44.1%) in PH appeared in both datasets. Comparison of minimum free energy showed that most lincRNAs were less stable than protein-coding RNAs. In addition, lincRNAs in K562 cells are pervasively transcribed from thousands of locations on every chromosome of the human genome and might play widespread roles in gene regulation and other cellular processes. Identifying overlapping and unique lincRNAs can help to classify lincRNAs by stability. Our results suggest that overlapping lincRNAs (relatively stable) and unique lincRNAs (relatively unstable) might be involved in different cellular processes.
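The overlapping and unique lincRNA counts reported above amount to simple set operations on two catalogs. The following Python sketch is only an illustration under stated assumptions: the function name `compare_catalogs` is hypothetical, and the integer identifiers are synthetic placeholders chosen solely to reproduce the reported catalog sizes (1,804 in PH, 1,587 in ENCODE, 795 shared); they are not real transcript IDs.

```python
def compare_catalogs(ph, encode):
    """Split two transcript catalogs into overlapping and unique sets."""
    overlap = ph & encode
    return {
        "overlapping": overlap,
        "unique_ph": ph - encode,
        "unique_encode": encode - ph,
        # Share of the PH catalog that is also present in ENCODE.
        "pct_ph_shared": 100.0 * len(overlap) / len(ph) if ph else 0.0,
    }


# Synthetic placeholder IDs sized to match the reported counts:
# IDs 0..794 are shared, 795..1803 are PH-only, 1804..2595 are ENCODE-only.
ph_lincrnas = set(range(0, 1804))
encode_lincrnas = set(range(0, 795)) | set(range(1804, 1804 + 792))

result = compare_catalogs(ph_lincrnas, encode_lincrnas)

print(len(result["overlapping"]))          # 795
print(len(result["unique_ph"]))            # 1009
print(len(result["unique_encode"]))        # 792
print(round(result["pct_ph_shared"], 1))   # 44.1

# The same ratio for protein-coding RNAs, using the reported counts.
print(round(100.0 * 2546 / 2914, 1))       # 87.4
```

With real data, the sets would hold transcript or locus identifiers, and deciding when two lincRNAs from different assemblies count as the same transcript is a separate matching step not covered by this sketch.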
Keywords/Search Tags: RNA sequencing technology (RNA-Seq), K562 cells, long non-coding RNAs (lncRNAs), microarray, long intergenic non-coding RNAs (lincRNAs), erythroid differentiation, megakaryocytic differentiation, gene regulatory network, transcription factors, miRNAs