Computational Methods for the Discovery and Application of Protein Sequence-Structure-Function Relationships

Posted on: 2015-04-03 | Degree: Ph.D | Type: Thesis
University: Dartmouth College | Candidate: He, Lu | Full Text: PDF
GTID: 2470390020450942 | Subject: Computer Science
Abstract/Summary:
Proteins are ubiquitous in cells and are essential to a wide range of biological processes. Since existing proteins occupy only a small portion of the space of possible amino acid compositions, understanding their sequence-structure-function relationships is important, both for revealing how particular amino acid sequences form viable proteins with specific functions and for guiding the design of novel protein variants. This thesis develops new computational methods that address protein sequence-structure-function relationships from three directions: optimizing protein engineering experiments that modify sequence in order to improve structure and function, searching for structural motifs in order to help characterize function, and discovering the constraints on sequence imposed by the pressure to evade immune response while maintaining function.

First, we develop an efficient optimization algorithm, CODNS, that extends the scope of DNA shuffling beyond its intrinsic homology-dependent and purely stochastic limitations, enabling experiments to incorporate more diverse parents and to generate more predictable, productive, optimized libraries. Second, we design a novel general-purpose protein engineering optimization framework, PEPFR, that produces all Pareto-optimal experiment designs, enabling the optimization of multiple competing engineering objectives. Third, we present a simple but general, effective, and efficient approach, BALLAST, for searching for instances of structural motifs within large databases of structures, enabling the prediction of functional characteristics of new proteins. Finally, we develop a novel model, JIS, for assessing the immunogenicity risk of protein antigens by bringing together both sides of T cell-mediated recognition of a foreign protein, and we apply this model to reveal patterns of pathogen camouflage, enabling the selection and optimization of antigens for vaccine design.
Keywords/Search Tags:Protein, Sequence-structure-function, Optimization, Enabling