
Analysis Of Human And Animal Viral Metagenomes And Whole Genomes Using High-throughput Sequencing And Bioinformatics

Posted on: 2020-05-10  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Wang  Full Text: PDF
GTID: 2370330575987785  Subject: Microbiology
Abstract/Summary:
Viruses are ubiquitous in nature. They lack a cellular structure but show characteristics of life such as inheritance and replication, and they depend on host cells to complete their life cycle. Viral diseases have an enormous impact on humans, animals and plants. Outbreaks such as SARS in 2003, Zika in 2013, Ebola in West Africa in 2014 and Middle East Respiratory Syndrome (MERS) caused major public health events of varying severity. Monitoring the emergence and re-emergence of viral diseases aims to curb the spread of the viruses, which requires both adequate preparation and rapid detection. Identifying the causative agent is one of the most important measures for responding effectively to a new outbreak. Traditionally, virus discovery required the virus to multiply in cell culture. However, many viruses do not multiply readily in cell culture and therefore cannot be isolated, which limits our understanding of viruses and in-depth mining of viral information. This study uses high-throughput sequencing combined with bioinformatics to analyze human and animal viral metagenomes and whole genomes. For clinical samples from patients with unexplained fever, or for pathogens that conventional methods cannot detect, high-throughput sequencing can be used for pathogen screening. Sample pre-treatments such as pathogen enrichment and removal of host nucleic acids have limited enrichment efficiency and cannot completely remove host nucleic acids, but they reduce interference from non-target microbes and improve pathogen detection. With the rapid development of science and technology, high-throughput sequencing now offers low cost, high speed, high quality and high throughput. Bioinformatics analysis of large volumes of sequencing data, especially whole-genome assembly, is time-consuming and laborious and has become a major challenge. In this paper, a
high-throughput sequencing analysis pipeline developed independently in our laboratory is used to perform bioinformatics analysis of known and unknown viruses. The raw data are first filtered to remove low-quality and short sequences, and the sequence coverage is then normalized to reduce the amount of data, and hence the analysis time, for the subsequent steps. The normalized reads are first quick-matched against the NR viral protein library, and all tentative viral sequences are extracted and searched against the NT nucleic acid library to verify that they are viral. The unmatched sequences are then extracted and re-analyzed with blastp against the viral protein library to find clues to novel viruses. The results are classified at the taxonomic levels of order, family, genus and species, and sequence coverage and similarity are displayed for manual checking. Using these analysis techniques, the following studies were carried out:
1. Adenovirus type 41 was identified by high-throughput sequencing screening of feces from a patient with unexplained diarrhea, which provided pathogen information for clinical treatment. Adenovirus type 41 primers were synthesized to amplify the whole genome of the target virus, and the whole genome sequence was obtained by a second round of high-throughput sequencing. A phylogenetic tree was constructed from the whole genome of the virus, and recombination and mutation analyses were performed. The sequence differences between the identified virus and related viruses, and the epidemic situation, were analyzed to provide guidance for epidemic prevention and disease treatment.
2. In response to the outbreak of hemorrhagic fever with renal syndrome in Shaanxi Province in 2017, clinical samples were collected for high-throughput sequencing to confirm Hantavirus infection. Viral whole-genome amplification primers were designed, nested PCR amplification was performed, and high-throughput sequencing was performed
again. The whole genome sequences of the viruses were obtained, the epidemic strains were analyzed by bioinformatics, and traceability analysis was carried out to support prevention and control of the epidemic.
3. From the virome analysis of mosquitoes in Mengla County, Yunnan Province, from May to November 2011, a novel virus was identified. Combined with genome amplification and sequencing, the whole genome sequence of the virus was obtained and a phylogenetic tree was constructed. This study also discovered a variety of new viruses carried by mosquitoes in Yunnan, which advances the understanding of arbovirus diversity.
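The read filtering and coverage normalization steps described in the abstract can be illustrated with a minimal Python sketch. It is not the laboratory's pipeline: the function names, the thresholds (`min_len`, `min_mean_q`, `max_cov`), the k-mer size and the Phred+33 quality encoding are all assumptions made for illustration. The normalization shown is a simple k-mer median scheme (often called digital normalization), swapped in because the abstract does not say which method was used.

```python
from collections import Counter
from statistics import median


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 is assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(reads, min_len=50, min_mean_q=20.0):
    """Drop reads that are too short or whose mean quality is too low.

    `reads` is an iterable of (name, sequence, quality) tuples.
    The default thresholds are illustrative, not taken from the thesis.
    """
    kept = []
    for name, seq, qual in reads:
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            kept.append((name, seq, qual))
    return kept


def normalize_coverage(reads, k=5, max_cov=2):
    """Reduce redundant coverage with a k-mer median rule.

    A read is kept only while the median count of its k-mers, counted
    over the reads kept so far, is still below `max_cov`. Reads from
    regions that are already well covered are discarded.
    """
    counts = Counter()
    kept = []
    for name, seq, qual in reads:
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        if not kmers:
            continue  # read shorter than k: nothing to count
        if median([counts[km] for km in kmers]) < max_cov:
            kept.append((name, seq, qual))
            counts.update(kmers)
    return kept
```

Calling `filter_reads` and then `normalize_coverage` on a list of reads yields a smaller read set that would then go to the protein and nucleotide database searches. Real data would need larger values of `k` and `max_cov` than the toy defaults here.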
Keywords/Search Tags: high-throughput sequencing, virome, whole genome, bioinformatics