Novel Methods For Bioinformatics Applications

Posted on: 2007-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: T Liu
GTID: 2178360182493703
Subject: Computer applications
Abstract/Summary:
An accurate solvation model is essential for computer modeling of protein folding and other biomolecular self-assembly processes. Compared to explicit solvent models, implicit solvent models, such as the Poisson-Boltzmann (PB) model, are much faster, which is the most compelling reason for their popularity. Since these implicit solvent models typically use parameters such as atomic radii and solvent accessible surface areas in their calculations, an optimal fit of these parameters is crucial to the final accuracy of the solvation free energy, the folding free energy, and other properties.

In the first half of this thesis, we propose a combined approach, SD/GA, which takes advantage of both local optimization with steepest descent (SD) and global optimization with the genetic algorithm (GA) for parameter optimization in multi-dimensional space. The SD/GA method is then applied to the optimization of solvation parameters in the non-polar cavity term of the PB model. The results show that the newly optimized parameters from SD/GA not only increase the accuracy of the solvation free energies for ~200 organic molecules, but also significantly improve the free energy landscape of β-hairpin folding. The current SD/GA method can be readily applied to other multi-dimensional parameter optimization problems as well.

Protein domains play an important role in protein science. The domain is considered the fundamental unit of protein structure, folding, evolution, and function. It can fold independently or semi-independently into a stable and compact structure, and it exhibits a rich evolutionary history and a specialized molecular function. A protein may comprise a single domain or several domains, which are not necessarily contiguous. Identifying protein domain boundaries is very important in protein studies, and it remains one of the most challenging problems in protein science.
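To illustrate the SD/GA idea from the first half of the abstract (steepest descent for local refinement inside a genetic algorithm for global search), here is a minimal Python sketch under stated assumptions. The toy objective, population size, mutation scale, step size, and every function name are hypothetical and are not taken from the thesis. The objective used in the thesis (the error in solvation free energies) is not reproduced here.

```python
import math
import random

def toy_objective(x):
    # Hypothetical multi-minimum test function; global minimum is 0 at x = 0.
    return sum(xi ** 2 + 2.0 * (1.0 - math.cos(3.0 * xi)) for xi in x)

def numerical_gradient(f, x, h=1e-6):
    # Central-difference estimate of the gradient of f at x.
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

def steepest_descent(f, x, steps=20, lr=0.01):
    # Local step: a fixed number of gradient steps, returning the best point seen.
    best, best_val = x, f(x)
    for _ in range(steps):
        g = numerical_gradient(f, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
        val = f(x)
        if val < best_val:
            best, best_val = x, val
    return best

def sd_ga(f, bounds, pop_size=30, generations=40, mutation_scale=0.3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        # Local refinement of every individual, then rank by objective value.
        pop = [steepest_descent(f, ind) for ind in pop]
        pop.sort(key=f)
        elite = pop[:pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation, clamped to bounds.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [min(max(c + rng.gauss(0.0, mutation_scale), lo), hi)
                     for c, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = elite + children
    best = steepest_descent(f, min(pop, key=f))
    return best, f(best)

best_x, best_val = sd_ga(toy_objective, [(-5.0, 5.0)] * 3)
print(best_x, best_val)
```

In this sketch the genetic operators move candidates between basins of the objective, while the steepest descent step pulls each candidate toward the bottom of its current basin. That division of labor is the motivation the abstract gives for combining the two methods.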
A large number of methods have been developed to detect domain boundaries or domain linkers. In the second half of this thesis, we offer a new approach that uses a support vector machine (SVM) to predict protein domain boundaries from sequence information alone. We tested several amino acid descriptors, including Position Index, Linker Index, Secondary Structure, Solvent Accessibility, Entropy, and Hydrophobicity, as well as some of their combinations. Trained on a dataset of 238 two-domain proteins from SCOP and CATH, the SVM achieves 65% accuracy in 10-fold cross-validation using the descriptors Position Index, Secondary Structure, and Solvent Accessibility. Compared with results reported for other approaches, this accuracy is much better than that of many existing methods. At the same time, the SVM is much more stable and faster as a predictor.
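The evaluation protocol mentioned above (an SVM scored by 10-fold cross-validation) can be sketched in plain Python as follows. This is only an illustration under assumptions: the data are synthetic three-feature vectors standing in for descriptor values, the classifier is a linear SVM trained with a Pegasos-style sub-gradient method, and all function names and hyperparameters are hypothetical. The abstract does not state the kernel, the feature encoding, or the training procedure used in the thesis.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    # Pegasos-style sub-gradient descent on the regularized hinge loss.
    # Labels are -1 or +1; a constant 1.0 feature plays the role of the bias.
    rng = random.Random(seed)
    Xb = [x + [1.0] for x in X]
    w = [0.0] * len(Xb[0])
    order = list(range(len(Xb)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xb[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xb[i])]
    return w

def predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, x + [1.0]))
    return 1 if score >= 0 else -1

def cross_validate(X, y, k=10, seed=0):
    # Shuffle once, split into k folds, and average the held-out accuracy.
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Synthetic, well-separated two-class data (NOT the SCOP/CATH dataset).
data_rng = random.Random(1)
X, y = [], []
for _ in range(200):
    label = data_rng.choice([-1, 1])
    X.append([data_rng.gauss(2.0 * label, 1.0) for _ in range(3)])
    y.append(label)

mean_accuracy = cross_validate(X, y, k=10)
print(mean_accuracy)
```

Each sample is held out exactly once, so the reported number is an average over ten models that never saw their own test fold. The accuracy on this synthetic data says nothing about the 65% figure reported in the abstract.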
Keywords/Search Tags: global optimization, solvation parameters, genetic algorithm, Poisson-Boltzmann model, protein domain prediction, SVM