Studies On The Molecular Evolution Of SARS Coronavirus

Posted on: 2008-02-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 1100360242472957
Subject: Bioinformatics
Abstract/Summary:
Severe Acute Respiratory Syndrome (SARS) coronavirus is an important novel pathogen. Evolutionary research on SARS coronavirus is significant for the control and prevention of this serious infectious disease. In this thesis, we explore three areas: reconstructing the genomic phylogeny of SARS coronavirus isolates, detecting the adaptive evolution of the S protein, and assessing the functional effects of structural variations in the receptor-binding domain.

To explore the evolutionary history of the SARS coronavirus genome, we construct the largest non-redundant alignment of complete SARS coronavirus genome sequences to date. To this data set we add the complete genome sequences of all SARS-like coronavirus isolates to construct a second genomic data matrix.

To detect potential recombination events among SARS coronavirus isolates, we perform automatic scanning of the two genomic data sets using five algorithms. The analyses consistently indicate the absence of recombination among SARS coronavirus isolates and between SARS coronavirus and SARS-like coronavirus. This finding lays a solid foundation for the whole-genome phylogenetic analysis of SARS coronavirus isolates.

To reconstruct the genomic phylogeny, we conduct neighbor-joining and Bayesian analyses on the two data sets mentioned above. The results consistently suggest that the monophyly of SARS coronavirus, the ingroup, is strongly supported when SARS-like coronavirus is used as the outgroup. Among all the SARS coronavirus isolates, three human isolates (ZSA, ZSB and ZSC) obtained in the early phase of SARS emergence are the most likely basal group. Hence, they are significant for understanding the molecular mechanisms by which SARS coronavirus crossed the species barrier.

To estimate the genomic substitution rate, we select five representative viral isolates with accurate sampling dates from the genome phylogenetic tree. A likelihood ratio test of the molecular clock hypothesis is performed on the alignment of those five isolates. Because the analysis supports a molecular clock, a linearized tree is reconstructed. The genomic substitution rate of SARS coronavirus is estimated to be 6.01105×10⁻⁶ per site per day. The estimated divergence time implies that SARS coronavirus emerged in the human population before 3 November 2002, which is consistent with the onset date of 16 November 2002 from epidemiological investigations.

To detect site-specific adaptive evolution of the SARS coronavirus S gene, we construct the largest non-redundant alignment of complete SARS coronavirus coding sequences to date. From this data set, we select some representative viral isolates to construct a small data set. Using the DATAMONKEY server, we conduct a single likelihood ancestor counting analysis on the large data set. It finds one positively selected site, as well as negatively selected sites, at the level of P=0.1. The single likelihood ancestor counting, fixed effects likelihood, and random effects likelihood analyses are conducted on the small data set and find one, nine, and eighteen negatively selected sites at the level of P=0.1 or BF=10, respectively. The functional implications of the sites under natural selection are also discussed.

To detect lineage-specific adaptive evolution of the SARS coronavirus S gene, we construct two data sets reflecting small and large evolutionary scales, respectively. Using the genetic algorithm implemented in the DATAMONKEY server, we detect branch-specific adaptive evolution on the basis of the two data sets. The results consistently indicate that the S gene underwent positive selection during different transmission phases or in different host species populations.

To explore the functional effects of structural variations in the receptor-binding domain (RBD) of the S protein, we conduct structural bioinformatics analyses on the structures of the RBD in complex with the cellular receptor or a neutralizing antibody. The simulated interactions between RBD mutants and host species-specific receptors are consistent with the experimental evidence. To support the research and development of antiviral inhibitors, we also predict some amino acid replacements that probably cause a significant increase in the affinities of the complexes. In addition, we predict the tertiary structure of the bat SARS-like coronavirus RBD by homology modeling. The important structural variations between the two RBDs revealed by the analysis lead to the inference that SARS-like coronavirus fails to infect humans.

Finally, we discuss future work, which includes the construction of a server for SARS coronavirus evolutionary analyses, a simulation study of the evolution of SARS coronavirus, the detection of adaptive evolution in other genes, and evolutionary analyses of coronavirus species genomes.
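As an illustration of the molecular clock test and rate estimate described in the abstract, the following minimal Python sketch converts the reported per-day substitution rate to a per-year rate and computes a p-value for a likelihood ratio test. It is not the thesis's own procedure. It assumes the usual n − 2 degrees of freedom for a clock test on n sequences (3 for five isolates), and the log-likelihood values in the example are hypothetical placeholders, not results from the thesis.

```python
import math

# Substitution rate reported in the abstract (per site per day).
RATE_PER_SITE_PER_DAY = 6.01105e-6


def rate_per_year(rate_per_day: float, days_per_year: float = 365.0) -> float:
    """Convert a per-site, per-day substitution rate to a per-year rate."""
    return rate_per_day * days_per_year


def lrt_statistic(lnl_clock: float, lnl_no_clock: float) -> float:
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_no_clock - lnl_clock)


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Uses the closed form for df = 3, so no external library is needed.
    Assumption: df = n - 2 = 3 for a clock test on five sequences.
    """
    if x <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


if __name__ == "__main__":
    print("rate per site per year:", rate_per_year(RATE_PER_SITE_PER_DAY))

    # Hypothetical log-likelihoods, for illustration only.
    lnl_clock, lnl_no_clock = -100.0, -98.0
    stat = lrt_statistic(lnl_clock, lnl_no_clock)
    print("LRT statistic:", stat, "p-value:", chi2_sf_df3(stat))
```

A small p-value would reject the clock hypothesis; a large one is consistent with a clock, which is the situation the abstract reports before building the linearized tree.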
Keywords/Search Tags: SARS Coronavirus, Genome, Bioinformatics, Molecular Evolution, Adaptive Evolution