
Identification of Dynamic Regions Among Multiple Cell Types Using Epigenetic Sequencing

Posted on: 2020-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: J H Xie
Full Text: PDF
GTID: 2370330611954739
Subject: Bioinformatics
Abstract/Summary:
Epigenetics is the study of heritable phenotype changes that do not involve alterations in the DNA sequence. The epigenome is the complete description of all the chemical modifications to DNA and histone proteins that regulate the expression of genes within the genome, including various histone modifications, DNA methylation, nucleosome distribution, and non-coding RNA effects. The nucleosome is the basic structural unit of chromatin, and its position on the genome can regulate biological activities such as transcription and replication of genomic DNA, because nucleosome occupancy can block protein-binding sites on the DNA. Nucleosome positions vary widely between cell lines, so we are concerned not only with where nucleosomes are located but also with how they change among individual samples and cell types. However, current tools for identifying these regions of variation support only two samples, which is a serious limitation for cohort studies involving multiple samples. We therefore developed DNMHMM, a tool based on a hidden Markov model with hypothesis testing that identifies dynamic nucleosome regions among multiple samples (n>=2). Applying the model to yeast mutants in which the modifiable histone residues were mutated into alanine, we found that the DNA sequences of the dynamic nucleosomes lack 10-11 bp periodicities and harbor the motifs of the nucleosome remodeling complex BAF1 and CBF1. During the activation of human CD4+ T cells, the dynamic nucleosomes are located around regulatory sites and partly account for changes in gene expression. Because nucleosome data, chromatin accessibility data, and other epigenomic sequencing data share the same form of information (read coverage), the tool should also perform well on other types of epigenome sequencing.

Histone modifications and DNA methylation are associated with the open or compact state of local chromatin. Chromatin accessibility affects the ability of regulatory factors to bind DNA and is an indicator of the activation status of the various regulatory elements on DNA. ATAC-Seq (Assay for Transposase-Accessible Chromatin using sequencing) is an important method for detecting accessible chromatin regions. Since it was proposed, ATAC-Seq has been used more and more frequently to detect open chromatin regions because of its simple experimental procedure and the low number of cells it requires. With ATAC-Seq, the state of each regulatory element (enhancer, insulator, etc.) on the DNA strand can be analyzed, and its effect on transcription can then be inferred. A large number of studies have shown that dynamic changes in open chromatin regions are closely related to the growth and evolution of cancer. As sequencing technology has developed and matured, more and more ATAC-Seq data have been generated; however, no public database has been designed specifically to collect and analyze these data. To analyze chromatin accessibility information across different data sets, it is necessary to establish an ATAC-Seq database. This thesis introduces an ATAC-Seq database (ATACMAP) that we developed, whose structure includes front-end analysis, visualization, and comparison modules, and back-end management modules. We collect and manage ATAC-Seq data with MySQL and use the genome browser JBrowse to visualize the data. We implemented an automated process for obtaining data from GEO and processing them. Users can view all the records in the database, quickly find the open chromatin regions identified by ATAC-Seq in different cell types (samples), and compare them to extract further biological information. All functions are available online, allowing users and administrators to view and manage data easily through a browser. This database provides a data source and a platform for further data mining in ATAC-Seq.

In short, the database, ATACMAP, provides a data source for differential recognition of epigenetic sequencing information between multiple cell types, and DNMHMM provides a reliable algorithm for identifying differential regions among multiple samples.
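
To make the idea of an HMM over multi-sample read coverage concrete, the following is a minimal sketch and not the DNMHMM implementation, whose details the abstract does not give. It assumes coverage has already been binned per sample, summarizes each bin by the variance of log2 coverage across samples, and decodes a two-state (stable/dynamic) chain with the Viterbi algorithm. The function names, the Gaussian emission parameters, and the transition probability are all hypothetical choices for illustration, and the hypothesis-testing step mentioned in the abstract is omitted.

```python
import math


def across_sample_spread(coverage):
    """Per-bin variance of log2(coverage + 1) across samples.

    coverage: list of samples, each a list of per-bin read counts
    (an assumed input layout, not taken from the thesis).
    """
    n_samples = len(coverage)
    n_bins = len(coverage[0])
    spread = []
    for b in range(n_bins):
        vals = [math.log2(sample[b] + 1) for sample in coverage]
        mean = sum(vals) / n_samples
        spread.append(sum((v - mean) ** 2 for v in vals) / n_samples)
    return spread


def log_gauss(x, mu, sd):
    """Log density of a normal distribution N(mu, sd^2) at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)


def viterbi_two_state(obs, means=(0.1, 2.0), sds=(0.5, 1.0), stay=0.9):
    """Most likely path over states 0 (stable) and 1 (dynamic).

    The emission means/sds and the self-transition probability are
    illustrative assumptions; a real tool would estimate them from data.
    """
    log_stay, log_switch = math.log(stay), math.log(1 - stay)
    # Uniform initial distribution over the two states.
    score = [math.log(0.5) + log_gauss(obs[0], means[s], sds[s]) for s in (0, 1)]
    back = []
    for x in obs[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch) for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best] + log_gauss(x, means[s], sds[s]))
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

Because the spread statistic is computed across all samples at once, the same code works for any number of samples of at least two, which is the property the abstract emphasizes. Runs of consecutive bins labeled 1 would correspond to candidate dynamic regions.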
Keywords/Search Tags: ATAC-Seq, database, dynamic nucleosome, identify dynamic regions among multi-sample, chromatin accessibility