| With the rapid development of bioinformatics, researchers continue to explore the rules of gene regulation. An incomplete understanding of the principles of transcriptional regulation prevents researchers from gaining insight into genome-wide regulatory networks. Today, pattern recognition of transcription factor binding sites has become one of the hot topics in bioinformatics. As an important class of transcriptional regulators, transcription factors regulate the expression of downstream genes by binding to their specific binding sites, thereby inhibiting or enhancing gene activity. Transcriptional regulation is a critical step in gene expression, and detecting these specific DNA sequences contributes greatly to the understanding of gene regulation. With the rapid development of automated high-throughput sequencing, chromatin immunoprecipitation (ChIP) combined with second-generation sequencing, together known as ChIP-seq, provides vast amounts of data for research on transcription factor binding site recognition. ChIP-seq offers a new approach to detecting transcription factor binding across the whole genome by immunoprecipitation of a specific protein, and it has become a general approach for de novo detection of transcription factor binding sites in a genome. Many algorithms already exist for detecting binding sites in ChIP-seq data, but they still have several flaws. First, they cannot handle the huge amounts of data produced by ChIP-seq and incur unrealistic running times; second, they need to filter repetitive sequences from the ChIP-seq data; finally, they cannot detect true transcription factor binding sites without negative reference data. 
To detect transcription factor binding sites more effectively, we propose a new algorithm for ChIP-seq data based on the idea of the expectation-maximization (EM) algorithm. Existing EM-based algorithms overlook the characteristics of ChIP-seq data. The algorithm is evaluated quantitatively on rice and mouse ChIP-seq data. It reports both known and new transcription factor binding sites. Compared with the proposed algorithm, existing algorithms are more time-consuming. The new algorithm proposed in this paper performs well in detecting transcription factor binding sites and provides a new technical means and an important tool for research on transcription factors. |