
Research on Identifying Transcription Factor Binding Sites Based on the Z_Curve Theory

Posted on:2009-09-13 Degree:Master Type:Thesis
Country:China Candidate:Y Cui Full Text:PDF
GTID:2120360245954056 Subject:Computer application technology
Abstract/Summary:
With the progress of the Human Genome Project (HGP), bioinformatics has entered the post-genome era. The study of gene coding regions is one of the hot topics, and the study of transcription factor binding sites (TFBSs) is one of its primary aspects. Many software tools and algorithms have been developed for identifying and predicting TFBSs, such as MatInspector, MEME, AlignACE, and the Gibbs sampler. As the underlying technologies have advanced, the study of TFBSs has made great progress, but many aspects of the field remain problematic and incomplete. More importantly, this line of study can advance the understanding of transcriptional regulation mechanisms. Recognizing TFBSs has therefore become an important question in bioinformatics.

This thesis applies the Z_Curve theory to the study of TFBSs and proposes a new Z_Curve-based model for describing their characteristics, the Z_Curve Coordination Matrix (ZCCM) model. The ZCCM model is essentially a single curve, called the center curve, that describes the characteristics and information of a TFBS; that is, the model is the coordinate matrix of the center curve. In the algorithm, we compute the similarity distance vector between the Z_Curve of a sequence and the center curve, and use these distance vectors as similarity features to train a back-propagation (BP) neural network and obtain the classification results. We apply the method to identify TFBSs in E. coli and obtain good results. We also compare the method with MatInspector. The comparison indicates that the new model performs well, depicts the characteristics of TFBSs accurately, is stable and accurate, and depends less on the DNA sequence data used to train the BP network.
In summary, the ZCCM model describes TFBSs more completely and more accurately. This thesis proposes a new model based on the Z_Curve theory, called ZCCM, for describing TFBSs, and presents the process and steps for building it. Furthermore, the experiments demonstrate the advantages of the model: our method for identifying TFBSs in E. coli is concise, efficient, and accurate, and the ZCCM model has both theoretical and practical significance for the study of TFBSs.
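
The abstract does not give formulas, so the following is only a minimal sketch of the pipeline it describes, under stated assumptions. The Z_Curve coordinates follow the standard definition from the Z-curve literature (cumulative purine/pyrimidine, amino/keto, and weak/strong hydrogen-bond differences). Taking the center curve as the position-wise mean of the Z_Curves of aligned, equal-length sites, and taking the similarity distance vector as the per-position Euclidean distance to that curve, are assumptions for illustration rather than the thesis's confirmed method. The function names and the example sites are hypothetical.

```python
import math

# Per-base increments of the standard Z-curve:
#   x: purine (A, G) vs pyrimidine (C, T)
#   y: amino (A, C) vs keto (G, T)
#   z: weak H-bond (A, T) vs strong H-bond (C, G)
STEPS = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "T": (-1, -1, 1)}


def z_curve(seq):
    """Return the cumulative Z-curve points (x_n, y_n, z_n) for n = 1..len(seq)."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # raises KeyError on non-ACGT symbols
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def center_curve(sites):
    """Assumed center curve: position-wise mean of the sites' Z-curves."""
    curves = [z_curve(s) for s in sites]
    length = len(curves[0])
    if any(len(c) != length for c in curves):
        raise ValueError("all sites must have the same length")
    return [
        tuple(sum(c[i][k] for c in curves) / len(curves) for k in range(3))
        for i in range(length)
    ]


def distance_vector(seq, center):
    """Assumed similarity feature: Euclidean distance to the center curve
    at each position of the sequence's Z-curve."""
    curve = z_curve(seq)
    if len(curve) != len(center):
        raise ValueError("sequence and center curve must have the same length")
    return [math.dist(p, q) for p, q in zip(curve, center)]


if __name__ == "__main__":
    # Hypothetical aligned binding sites, for illustration only.
    known_sites = ["TATAAT", "TATGAT", "TACAAT"]
    center = center_curve(known_sites)
    print(distance_vector("TATAAT", center))
    print(distance_vector("GCGCGC", center))
```

In this sketch, each distance vector would serve as one input feature vector for the BP network; the network training itself is not shown.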
Keywords/Search Tags:Transcription Factor Binding Sites, Z_Curve Theory, Euclidean Distance, BP Neural Network