Severe acute respiratory infection (SARI) is a leading cause of hospitalization and death in children, and most respiratory tract infections in children are caused by viruses. A wide variety of viruses are associated with respiratory infections, and they mutate constantly at high evolutionary rates. Conventional laboratory methods, including molecular and serological detection, can detect only one or a few viruses at a time; they are therefore not comprehensive and cannot fully reflect a patient’s infection status. High-throughput sequencing (HTS) has become a powerful tool for pathogen detection because, unlike traditional methods, it can detect all pathogens in the respiratory tract without prior genetic information. Comprehensive and detailed metagenomic analyses of respiratory tract samples from children with SARI and matched control groups have not yet been reported in China. We therefore conducted a viral metagenomic analysis to provide a complete picture of the viral content and diversity in the respiratory tracts of children with SARI. First, we established a platform for processing and analyzing respiratory samples and explored nucleic acid amplification methods. We then performed a viral metagenomic analysis of respiratory tract samples from children with SARI and matched control groups to determine the main pathogenic agents of SARI in children in Beijing. Finally, using next-generation sequencing, we determined the full-length genomic sequences of one human adenovirus strain and four human bocavirus strains and identified some new genomic characteristics. The main results and conclusions are summarized as follows:

1. Optimization of respiratory sample pretreatment and nucleic acid amplification

An isolated HCoV-NL63 strain was used to compare pretreated and untreated samples.
Data analysis showed that the pretreated group consistently yielded a much higher proportion of virus-related reads and a higher genome coverage rate. A comparison of two nucleic acid amplification methods, sequence-independent single primer amplification (SISPA) and multiple displacement amplification (MDA), indicated that SISPA is suitable for amplifying general respiratory samples, whereas MDA is suited to amplifying micro-samples. For whole-genome sequencing, the specific random-primer method performed much better on the complete genome of HCoV-NL63 than on that of HCoV-OC43. With the segmented amplification method, most of the HCoV-NL63 genome sequence could be obtained, although amplification of some fragments failed.

2. Comparative analysis of viral genetic diversity in respiratory samples from children with SARI and the control group

In this study, nasopharyngeal swabs from children with and without SARI (135 vs. 15, respectively) were collected in China and subjected to multiplex metagenomic analysis on a next-generation sequencing platform. The results show that members of the Paramyxoviridae, Coronaviridae, Parvoviridae, Orthomyxoviridae, Picornaviridae, Anelloviridae, and Adenoviridae families were the most abundant species identified in the respiratory tracts of children with SARI. The viral population in the respiratory tracts of children without SARI was less diverse and was dominated by the Anelloviridae family, with only a small proportion of common epidemic respiratory viruses. According to the comparative analysis, the viruses with a genome coverage rate of over 50%, including HRSV, HCoVs (HCoV-OC43, HCoV-229E, and HCoV-HKU1), HBoV, HPIVs, influenza A virus, and HAdV, may be the main causes of SARI in children in Beijing.
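The coverage-rate criterion mentioned above can be illustrated with a minimal sketch. It assumes that reads have already been aligned to reference genomes and that each alignment is given as a half-open interval [start, end) in reference coordinates. The function names, the input layout, and the example virus labels are hypothetical and are not taken from the study's actual pipeline; only the 50% threshold comes from the text.

```python
from typing import Dict, List, Tuple

# Half-open interval [start, end) in reference-genome coordinates.
Interval = Tuple[int, int]


def covered_bases(intervals: List[Interval]) -> int:
    """Count reference positions covered by at least one aligned read."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap begins here: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent read: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def coverage_rate(intervals: List[Interval], genome_length: int) -> float:
    """Fraction of the reference genome covered by aligned reads."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return covered_bases(intervals) / genome_length


def viruses_above_threshold(
    alignments: Dict[str, List[Interval]],
    genome_lengths: Dict[str, int],
    threshold: float = 0.5,  # the "over 50%" criterion from the text
) -> List[str]:
    """Return the names of viruses whose coverage rate exceeds the threshold."""
    return sorted(
        name
        for name, intervals in alignments.items()
        if coverage_rate(intervals, genome_lengths[name]) > threshold
    )
```

Merging overlapping intervals before counting ensures that positions covered by many reads are counted once, so the rate reflects breadth of coverage rather than sequencing depth.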
For viruses related to Anelloviridae, there was little difference between children with SARI and the control group, suggesting that there may be no direct link between acute respiratory infection and TTVs.

3. Whole-genome sequencing of common respiratory viruses

Using the deep sequencing platform, we successfully performed whole-genome sequencing and sequence analysis of common respiratory viruses, including one HAdV strain (CBJ113) and four HBoV strains. BootScan and single nucleotide polymorphism analyses showed that the CBJ113 genome has a complex intra-subtype recombinant structure and comprises gene regions originating from two circulating viral strains, HAdV-1 and HAdV-2. Analyses of HBoV1 evolutionary rates showed that NS1 was the most conserved gene, whereas VP1 evolved fastest, at 4.20 × 10⁻⁴ substitutions/site/year. Nucleotide deletions and substitutions occurred most frequently in NP1 and VP1 and represent novel molecular signatures that enable differentiation between HBoVs. There was no significant difference in nucleotide diversity among HBoV1 strains from different regions, and the nucleotide sequence diversity of the VP1 gene was greater than that of the other genes. In addition, the nucleotide sequence diversity of HBoV2 is greater than that of HBoV1.

In this study, we compared the viral genetic diversity in respiratory samples from children with SARI and the control group and determined the main pathogenic agents of SARI in children in Beijing. In addition, using the deep sequencing platform, we obtained the complete genomes of common respiratory viruses, including HAdV and HBoV. These studies provide a new method for diagnosing respiratory disease and a new platform for further study of viral evolution and mutation.