
The Effects Of Histone Modifications And Transcription Factors On Gene Expression In H1, GM12878 And K562 Cells

Posted on: 2019-03-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Q Zhang
GTID: 1360330596956128
Subject: Biology
Abstract/Summary:
With the rapid development of high-throughput sequencing technology and the success of the human epigenome project, a large amount of histone modification (HM) data and transcription factor (TF) binding data has been obtained. However, theoretical analysis of these massive data sets has not kept pace with the growth of experimental data, so bioinformatic analysis of these data has become a hot topic. In addition, studies have shown that HMs and TFs play indispensable roles in gene expression. On this basis, using publicly available data for the human embryonic stem cell (H1), B-lymphoblastoid cell (GM12878) and erythrocytic leukemia cell (K562) lines, we systematically study the relationships of HMs and TFs to gene expression. The main conclusions are as follows:

1. We investigate the distributions of HMs and TFs in the DNA regions flanking the transcription start sites (TSSs) of all RefSeq genes, and the effects of these distributions on gene expression. The effects of HMs and TFs on gene expression depend on their binding signals: DNA regions with higher average HM and TF binding signals correlate more strongly with gene expression levels. In addition, in the H1 cell, we study the distributions of 57 TFs in the DNA regions flanking the TSSs of highly and lowly expressed genes. The results show that 54 TFs activate gene expression, one TF inhibits gene expression, and the remaining two TFs perform different functions depending on their target genes.

2. We define transcription factor synthetic indexes (TFSIs) by using principal component analysis. Based on these TFSIs, we construct a new theoretical model for predicting gene expression levels, which achieves better predictive results. When this model is used to predict the expression levels of genes with high or low CpG content promoters, we also find some important TFSIs.

3. Using a Gaussian kernel density function, we define the TF association strength (TFAS). We apply the TFASs to predict the target genes of TFs and obtain better theoretical results. The predicted targets include not only genes confirmed by previous experiments but also some new target genes.

4. In H1, GM12878 and K562 cells, we estimate the ability of TFs and HMs to predict gene expression levels with multivariate linear regression (MLR), support vector regression (SVR) and random forest (RF) regression models. The results indicate that TFs and HMs achieve similar predictive performance. We then identify the target genes regulated by TFs and HMs with the BETA software. We find that, for HMs and TFs with stronger correlations, the number of genes co-regulated by both exceeds the number of genes regulated by HMs alone or by TFs alone. This confirms that TFs and HMs have similar effects on gene expression, and this similarity explains why TFs and HMs perform similarly in predicting gene expression.

5. We examine the contribution of each HM or TF to gene expression levels in H1, GM12878 and K562 cells. The results show that POLR2A and H3K36me3 play important roles in gene expression. To verify the reliability of this conclusion, we construct an interaction network among HMs, TFs and gene expression data, and we quantify the contribution score of each HM to gene expression levels in each of the 80 bins. The interaction network shows that H3K36me3 and POLR2A promote gene expression by acting directly, and H3K36me3 generally obtains higher contribution scores across the 80 bins.

6. In human chronic myelogenous leukemia (CML), we systematically analyze the effects of gene-body H3K36me3 levels on the degree of gene expression change. Genes with larger expression changes during tumorigenesis have lower H3K36me3 levels in their gene-body regions, while genes with higher gene-body H3K36me3 levels tend to show smaller changes in expression. We also validate this finding in human lung cancer, human breast cancer and mouse CML. Further studies suggest that genes with lower gene-body H3K36me3 levels participate in many cancer-related biological processes, such as sustaining proliferative signaling, avoiding immune destruction, tumor-promoting inflammation, resisting cell death, inducing angiogenesis and deregulating cellular energetics. Moreover, by combining our results with previous theoretical prediction algorithms, we identify five important driver genes related to CML: WT1, DNMT3A, CACNA1E, PHACTR1 and GBP4.
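
The abstract does not give the formula for the TF association strength, so the following is only a minimal sketch of one plausible reading of conclusion 3. It assumes that each TF binding peak contributes its signal to a gene, down-weighted by a Gaussian kernel of its distance to the TSS. The function names, the 1000 bp bandwidth and the toy coordinates are all hypothetical and are not taken from the dissertation.

```python
import math


def gaussian_weight(distance_bp, bandwidth_bp):
    """Gaussian kernel weight for a peak at a given distance from the TSS."""
    return math.exp(-(distance_bp ** 2) / (2.0 * bandwidth_bp ** 2))


def association_strength(tss, peaks, bandwidth_bp=1000.0):
    """Sum the peak signals, each down-weighted by its distance to the TSS.

    `peaks` is a list of (position, signal) pairs for one TF.
    The bandwidth is an illustrative assumption, not a value from the thesis.
    """
    return sum(
        signal * gaussian_weight(position - tss, bandwidth_bp)
        for position, signal in peaks
    )


def rank_targets(tss_by_gene, peaks, bandwidth_bp=1000.0):
    """Rank candidate target genes of one TF by descending association strength."""
    scores = {
        gene: association_strength(tss, peaks, bandwidth_bp)
        for gene, tss in tss_by_gene.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: two hypothetical genes and two peaks of a single TF.
toy_tss = {"gene_a": 1000, "gene_b": 50000}
toy_peaks = [(1000, 2.0), (1500, 1.0)]
ranking = rank_targets(toy_tss, toy_peaks)
```

In this toy example both peaks lie near the TSS of gene_a, so it ranks above gene_b. A real analysis would choose a score cutoff or bandwidth from the data, which this sketch does not attempt.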
Keywords/Search Tags:human embryonic stem cell, B-lymphoblastoid cell, erythrocytic leukemia cell, histone modification, transcription factor, gene expression