Research On The Identification, Classification And Evolutionary Variation Of Bacteria Based On Whole Genome

Posted on: 2021-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: J L Zhou
Full Text: PDF
GTID: 2370330614970435
Subject: Bioinformatics
Abstract/Summary:
Bacteria are one of the principal groups of organisms on Earth and the most numerous of all living things. They comprise a wide variety of species, live in a broad range of environments, and have a close but complex relationship with humans. As human activity expands and production practices and lifestyles change, infectious diseases caused by bacteria continue to emerge and have become increasingly prevalent worldwide, seriously threatening human health and public health security. Rapid identification and comprehensive analysis of bacteria, especially outbreak strains, are therefore necessary. A bacterial genome contains all of the organism's genetic information, so determining the genome sequence is the basis and prerequisite for understanding its biological and functional characteristics. Whole-genome-based methods currently provide the highest resolution for bacterial identification and comprehensive analysis. With continued improvements in high-throughput sequencing (HTS) technology and the declining cost of sequencing, the number of bacterial whole genomes in public databases keeps growing, which gives researchers a solid data foundation. This study carries out a series of bacterial identification, classification, and evolutionary variation analyses based on whole bacterial genome data from public databases.

First, the concept of "species-specific k-mers" was proposed, and a database of bacterial species-specific k-mers was established. From the perspective of genomic k-mer comparison and short-read alignment, we designed and built an assembly-free platform for the comprehensive analysis of bacterial genome evolution and variation. The platform integrates three methods: bacterial identification based on the species-specific k-mer database, detection and annotation of bacterial horizontal gene transfer (HGT) based on k-mer backtracing, and prediction of important phenotypic genes based on short-read alignment. For a tested bacterium, the platform analyzes the raw sequencing reads directly to quickly obtain the species information, the HGT fragments and their CDS annotations, and the distribution of genes such as antibiotic resistance genes, virulence factors, and protective antigens. The accuracy and reliability of the proposed methods were verified on several real and simulated sequencing datasets.

Second, the taxonomy of the Burkholderia cepacia complex (BCC) was systematically analyzed based on all available whole-genome sequences. We compared phylogenetic trees of the BCC based on 16S rRNA, recA, hisA, and multilocus sequence analysis (MLSA). These commonly used markers were found to have limited resolution in the taxonomic study of closely related bacteria such as the BCC. In contrast, the species tree and dDDH/ANI clustering clearly divided the BCC strains into 36 groups. After appropriate reclassification of previously misidentified strains, these groups correspond to 22 known BCC species and 14 putative new species. In addition, the results suggest that combining the phylogenetic tree based on single-copy orthologous genes with pan-genome-based dDDH/ANI clustering provides a better framework for delineating bacterial species, especially closely related ones.

Third, the pan-genomic characteristics of the BCC and the evolutionary dynamics of its core genes were analyzed. The results suggest that the pan-genome of the BCC is large and divergent, with a certain degree of openness. Of the 1005 core-genome orthogroups consisting entirely of single-copy genes, 5.77% showed significant homologous recombination signals. Notably, recombination between species was more common than recombination within species in the BCC. This high level of interspecies recombination may be the key force that maintains the broad genetic cohesion of the BCC and reinforces the high similarity among its species. Positive selection analyses showed that the number of positively selected genes in the core genome of the BCC was relatively small; the eleven genes under positive selection were mainly involved in protein synthesis, material transport, and metabolism. Moreover, ten of the eleven positively selected genes showed signatures of homologous recombination in at least one test, further supporting the view that homologous recombination may be a mechanism for maintaining the genetic cohesion of the BCC. The adaptive variations in the genes under positive selection are possibly related to dynamic interactions between changing environmental conditions and the host immune system. These positively selected genes might serve as targets for further investigation of adaptive evolution mechanisms and host-pathogen interactions within the BCC.

Recent developments in microbial whole-genome sequencing offer promising prospects for improving diagnostics and public health microbiology. Whole-genome-sequencing-based approaches will play an increasingly important role in laboratory diagnosis, nosocomial infection investigation, and infectious disease prevention and control. The assembly-free platform for bacterial genome evolution and variation analysis established in this study can rapidly and comprehensively analyze tested bacteria and report their biological characteristics, providing reliable information to support the selection of clinical treatment plans and the prevention and control of infectious diseases. Our genome-wide analyses reconstructed the taxonomy of the BCC and elucidated the characteristics of its pan-genome and the evolutionary events in its core genome, which helps explain the confusing taxonomic status of BCC bacteria and the difficulty of identifying them.
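
The idea of "species-specific k-mers" and read-based identification can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes that a k-mer is species-specific when it occurs in the genomes of exactly one species in the reference set, that k-mers are compared in canonical form (the lexicographically smaller of a k-mer and its reverse complement), and that a sample is scored by counting how many of its reads' k-mers hit each species' set. The function names `canonical_kmers`, `species_specific_kmers`, and `identify` are hypothetical, and the toy sequences are made up for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers of seq (k-mers with ambiguous bases are skipped)."""
    seq = seq.upper()
    out = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            revcomp = kmer.translate(_COMPLEMENT)[::-1]
            out.add(min(kmer, revcomp))
    return out


def species_specific_kmers(genomes_by_species, k):
    """Map each species to the k-mers found in its genomes and in no other species.

    Assumption: the k-mers of a species are the union over all of its genomes.
    """
    per_species = {
        species: set().union(*(canonical_kmers(g, k) for g in genomes))
        for species, genomes in genomes_by_species.items()
    }
    # Count in how many species each k-mer occurs.
    species_count = Counter()
    for kmers in per_species.values():
        species_count.update(kmers)
    return {
        species: {km for km in kmers if species_count[km] == 1}
        for species, kmers in per_species.items()
    }


def identify(reads, db, k):
    """Count, per species, how many distinct k-mers of each read are species-specific hits."""
    hits = Counter()
    for read in reads:
        read_kmers = canonical_kmers(read, k)
        for species, specific in db.items():
            hits[species] += len(read_kmers & specific)
    return hits


# Toy reference set with two hypothetical "species".
db = species_specific_kmers({"A": ["AAAAACCCCC"], "B": ["AAAAAGAGAG"]}, k=5)
# A read from the reverse strand of species A is still assigned to A.
hits = identify(["GGGGTTT"], db, k=5)
print(hits.most_common())
```

Because no assembly is needed, raw reads can be scored directly. A real database would need a compact k-mer index and a much larger k than this toy example uses.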
Keywords/Search Tags: bacterial whole genome, identification, classification, horizontal gene transfer, evolutionary dynamics