An Algorithm To Detect TFBSs Based On ChIP-seq Data

Posted on:2017-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Jia

Full Text:PDF

GTID:2180330482487174

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of bioinformatics, researchers continue to explore the rules of gene regulation. An incomplete understanding of the principles of transcriptional regulation prevents researchers from gaining insight into genome-wide regulatory networks. Pattern recognition of transcription factor binding sites (TFBSs) has therefore become one of the hot topics in bioinformatics. Transcription factors are an important class of transcriptional regulators: by binding to their specific binding sites, they repress or enhance the expression of downstream genes. Transcriptional regulation is a critical step in gene expression, and detecting these specific DNA sequences contributes greatly to the understanding of gene regulation.
With the rapid development of automated high-throughput sequencing, chromatin immunoprecipitation (ChIP) combined with second-generation sequencing, which together form ChIP-seq technology, provides vast amounts of data for research on transcription factor binding site recognition. ChIP-seq offers a new approach to detecting transcription factor binding across the whole genome by immunoprecipitating a specific protein, and it has become a general approach for de novo detection of transcription factor binding sites on the genome. Many algorithms already exist for detecting binding sites in ChIP-seq data, but they still have several flaws. First, they cannot handle the huge amounts of data produced by ChIP-seq and incur unrealistic running times; second, they need to filter repetitive sequences out of the ChIP-seq data; finally, they cannot detect true transcription factor binding sites without negative reference data.
To detect transcription factor binding sites more effectively, we propose a new algorithm for ChIP-seq data based on the idea of the expectation-maximization (EM) algorithm. Existing EM-based algorithms overlook the characteristics of ChIP-seq data. The algorithm is evaluated quantitatively on rice and mouse ChIP-seq data. It reports both previously known and new transcription factor binding sites. Compared with the proposed algorithm, existing algorithms are more time-consuming. The new algorithm proposed in this paper performs well in detecting transcription factor binding sites and provides a new technical means and an important tool for research on transcription factors.
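
The abstract does not describe the proposed algorithm in detail, so the following is only a generic illustration of how EM can be applied to motif discovery, not the method of this thesis. It is a minimal sketch that assumes exactly one motif occurrence per sequence, a uniform background model, and sequences over A, C, G, T that are at least as long as the motif. The names em_motif and site_posteriors and all parameters are hypothetical.

```python
import random

ALPHABET = "ACGT"
BACKGROUND = 0.25  # assumption: uniform background base frequencies


def site_posteriors(seqs, pwm, width):
    """E-step: for each sequence, posterior probability of each motif start."""
    posteriors = []
    for s in seqs:
        scores = []
        for start in range(len(s) - width + 1):
            ratio = 1.0  # likelihood ratio of motif model vs. background
            for j in range(width):
                ratio *= pwm[j][s[start + j]] / BACKGROUND
            scores.append(ratio)
        total = sum(scores)
        posteriors.append([x / total for x in scores])
    return posteriors


def em_motif(seqs, width, iters=50, seed=0, pseudo=0.1):
    """Toy EM motif finder (one occurrence per sequence is assumed)."""
    rng = random.Random(seed)
    # Random, strictly positive starting position weight matrix (PWM).
    pwm = []
    for _ in range(width):
        raw = [rng.random() + 1.0 for _ in ALPHABET]
        total = sum(raw)
        pwm.append({b: r / total for b, r in zip(ALPHABET, raw)})
    for _ in range(iters):
        posteriors = site_posteriors(seqs, pwm, width)
        # M-step: expected base counts per motif column, plus pseudocounts.
        counts = [{b: pseudo for b in ALPHABET} for _ in range(width)]
        for s, post in zip(seqs, posteriors):
            for start, p in enumerate(post):
                for j in range(width):
                    counts[j][s[start + j]] += p
        pwm = [{b: c[b] / sum(c.values()) for b in ALPHABET} for c in counts]
    # Report the most probable start position in each sequence.
    final = site_posteriors(seqs, pwm, width)
    starts = [max(range(len(p)), key=p.__getitem__) for p in final]
    return pwm, starts
```

Calling em_motif(seqs, 5) on a list of DNA strings returns the estimated PWM and one predicted site start per sequence. Like any EM procedure, this sketch can converge to a local optimum, so results depend on the random initialisation.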

Keywords/Search Tags:

Transcription factor, transcription factor binding sites, ChIP-seq technology, EM algorithm

Related items

1	Algorithm Research On The Problem Of Transcription Factor Binding Sites Identification
2	The Research On The Discovery Of Transcription Factor Binding Sites Based On Genetic Algorithm
3	Transcription Factor Binding Sites Prediction Algorithm Study And Application
4	The Distribution Of Transcription Factor Binding Sites In Upstream Regions Of Yeast Genes
5	A Novel Method For Classification Of Transcription Factors Based On Properties Of Transcription Factor Binding Sites
6	Based On The Information Of Sequences To Predict The Transcription Factor Binding Sites And Promoter
7	Research On Transcription Factor Binding Sites Recognition Based On HMM
8	The Evolution Of Transcription Factors Binding Sites Clustered Regions In Eukaryote
9	The Research For Recognition Of Transcription Factor Binding Sites Based On Genetic-Neural Network
10	Genome-wide Analysis Of Transcription Factor Binding Sites And Gene Mutation Of Genetic Disease