Font Size: a A A

Protein structure prediction by linear programming

Posted on:2004-11-28Degree:Ph.DType:Thesis
University:University of Waterloo (Canada)Candidate:Xu, JinboFull Text:PDF
GTID:2460390011462677Subject:Computer Science
Abstract/Summary:
If the primary sequence of a protein is given, what is its three-dimensional structure? This is one of the most important and difficult problems in molecular biology and has tremendous implication to proteomics. Over the last three decades, this issue has been intensely researched. Protein threading represents one of the most promising techniques. So far, there are many protein structure prediction computer programs based on protein threading; however, almost none incorporates the pairwise contact (interaction) potential explicitly in its energy function, although scientists believe that pairwise interactions are important for fold recognition targets. The underlying reason for ignoring the pairwise potential is that the protein threading problem is NP-hard (i.e., it is unlikely to have a polynomial-time algorithm), if the pairwise interactions are treated rigorously.; The key contribution of this dissertation is to show that there is actually a practically efficient algorithm for protein threading, even if the pairwise interactions are treated explicitly and rigorously. In this thesis, we propose a novel integer programming approach to solve the protein threading problem. The key element is a set of strong linear constraints to formulate the problem. In addition, we employ polyhedral combinatorics to analyze our formulation, and confirm that our formulation is well-described for the protein threading problem. Large scale experimental tests demonstrate that the optimal linear solutions of almost all the linear programs that formulate real threading instances solve to integrality, indicating that the protein threading problem is tractable in practice (i.e., it can be solved within polynomial time).; We have implemented the proposed algorithm combined with several other components as a protein structure prediction computer program RAPTOR (RApid Protein Threading predictOR). CAFASP3 (The Third Critical Assessment of Fully Automated Structure Prediction) evaluation, the latest worldwide protein structure prediction competition, ranks RAPTOR first among all the non-meta structure prediction programs. The latest LiveBench evaluation also shows that the performance of RAPTOR is superior compared to other structure prediction tools. These indicate that the integer programming approach is promising in further improving the prediction rate of the protein threading approach.
Keywords/Search Tags:Protein, Prediction, Structure, Linear
Related items