Font Size: a A A

Research On Fast Motif Finding Methods Based On Heuristic Strategies

Posted on:2012-03-24Degree:MasterType:Thesis
Country:ChinaCandidate:S S LvFull Text:PDF
GTID:2178330335450550Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In the 1950s, the discovery of the DNA double helix structure created the molecular biology of this era. The life science research that put biology and medicine as the main research contents had entered the unprecedented high-speed development stage. With the completion of the human genome sequencing and the revolutionary growth of the relevant information of molecular biology, processing mass biological data was needed. Bioinformatics arises at the historic moment. Although today's bioinformatics has more complex and system research, the understanding of the mechanisms of gene transcription regulation is still an important and challenging problem in bioinformatics. And the committed step to solve this challenging problem is the ability of identifying the basic components which regulate and control the gene transcription:transcription factors binding sites (also called motif).At present, the algorithms for predicting the transcription factor binding sites mainly divides into two major classes:heuristic enumeration methods based on consensus expression and local optimization methods based on position specific weight matrix. The algorithm Weeder which is based on consensus expression and use heuristic search strategy make a better result in all algorithms. Then, we mainly discuss the algorithm for discovering motif based on consensus expression. There is a series of tries for improve the Weeder. The work and contributions are the following:1.The general suffix tree using the N sequences (length L from 600 to 5000) and the space complexity is O(NL), we find that we only use part suffix tree to search. So we give a part suffix tree, and the space complexity is O(Nl).2.We observed that there is too many candidate motifs in the searching process so that a new method for compressing the searching space was brought in. The speed of Weeder was accelerated on condition that no precision was reduced.3.Weederpromote which express excellent in space and time was given based on the heuristic strategies.It discovered the (l, d)motif when the length l of motif was unknown. The experimental results on manual and real datasets reveal that the Weederpromote has superiority than Weeder.
Keywords/Search Tags:Bioinformatics, Transcription factor binding site, Motif
PDF Full Text Request
Related items