Research On Identification Of Chromatin State Based On ChIP-Seq Data

Posted on: 2020-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: L Xu
GTID: 2370330590974434
Subject: Computer Science and Technology
Abstract/Summary:
With the successful completion of the Human Genome Project, the ENCODE project, and the 1000 Genomes Project, the focus of genomics research has shifted from deciphering the genetic information and code of living organisms to functional studies at the overall molecular level. The rapid development of high-throughput sequencing technology has strongly promoted functional genomics. Chromatin is increasingly recognized as a dynamic genomic organizer that directs DNA activity. Annotating chromatin states with combinations of histone modification signals is the primary method for finding activity patterns specific to regulatory regions and cell types, and for explaining their relationship to disease. Long-range chromatin contacts between specific DNA regulatory elements play a key role in regulating gene expression, and global characterization of the interactions in these three-dimensional (3D) chromatin structures is essential for understanding signaling networks and cell states. Chromatin Interaction Analysis by Paired-End Tag sequencing (ChIA-PET) captures these interactions, which regulate gene expression and in turn affect cellular activity.

In this thesis, histone modification data from several cell lines (GM12878, K562, MCF7, HeLa-S3) are selected for data analysis and a study of enrichment levels. Feature selection is used to build a feature representation, which is combined with a variety of unsupervised clustering algorithms to identify chromatin states across the cell lines. Experiments show that combining feature representation with clustering gives better results, which indicates that the approach is reasonable and effective. ChIP-Seq data alone cannot predict interactions in the three-dimensional chromatin structure. ChIA-PET data are therefore added to supply the interaction information, and the two are combined into a data set used to predict the [...] supervised learning method. The method combines sequence features and epigenetic features: a Word2Vec language model processes the sequence features, and the epigenetic features include histone modification and DNase-Seq signals. The experimental results show that the model effectively identifies long-range chromatin interactions and improves the characterization of the three-dimensional chromatin state.
Keywords/Search Tags: chromatin state, histone modification, feature representation, ChIA-PET, ChIP-Seq, Word2Vec
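
The abstract states that a Word2Vec language model processes the sequence features, but it does not say how DNA sequences are turned into "words". One common choice, assumed here purely for illustration, is to split each sequence into overlapping k-mers. The function name `kmer_tokens`, the values of `k` and `stride`, and the example sequences are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def kmer_tokens(sequence, k=6, stride=1):
    """Split a DNA sequence into overlapping k-mers ("words").

    Assumption: k-mer tokenization is only one plausible preprocessing
    step; the thesis abstract does not specify its tokenization scheme.
    """
    seq = sequence.upper()
    tokens = []
    for start in range(0, len(seq) - k + 1, stride):
        kmer = seq[start:start + k]
        # Skip windows that contain ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            tokens.append(kmer)
    return tokens


# Hypothetical sequences standing in for interaction anchor regions.
anchors = ["ACGTACGTAC", "ttgacgNacgtt"]

# Each sequence becomes one "sentence" of k-mer tokens.
corpus = [kmer_tokens(s, k=4) for s in anchors]

# Token counts give a quick view of the resulting vocabulary.
vocab = Counter(tok for sentence in corpus for tok in sentence)
```

The token lists in `corpus` have the sentence-of-words shape that Word2Vec implementations typically expect as training input; the learned k-mer vectors could then be pooled per region and joined with histone modification and DNase-Seq signals as the combined feature set the abstract describes.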