Dynamics Of HIV-1 Quasispecies Diversity Of Participants Under Antiretroviral Therapy Based On Intra-host Single-nucleotide Variations And Next-generation Sequencing

Posted on: 2022-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Y Zhang
Full Text: PDF
GTID: 1484306344471564
Subject: Pathogen Biology
Abstract/Summary:
Background

The proviral HIV DNA integrates into the genome of the infected cell, forming the HIV reservoir. The latent HIV reservoir is the major obstacle to HIV eradication. After viral suppression, the total HIV DNA level approximates the size of the viral reservoir and remains stable during antiretroviral therapy (ART). Our team's previous study indicated that whole-blood HIV DNA and plasma HIV RNA show different drug-resistance mutation patterns at the same sampling time. Knowledge of how drug resistance-associated variations evolve, and of HIV DNA quasispecies dynamics under long-term ART, remains limited. It is therefore necessary to establish a reliable detection and analysis process and to carry out evolutionary research in different HIV cohorts.

HIV evolves rapidly, is highly polymorphic, and is easily affected by multiple factors, so intra-host HIV quasispecies can form within a short time. In recent years, high-throughput sequencing (HTS, also known as next-generation sequencing, NGS), characterized by massively parallel sequencing, has been widely used to analyze complex populations and provides powerful tools for studying HIV quasispecies dynamics. NGS yields not only the consensus sequence, which represents the dominant quasispecies, but also the location and frequency of low-frequency mutations. Low-frequency mutations within an individual are intra-host single-nucleotide variations (iSNVs). A large number of studies have confirmed that iSNVs provide enough information to define the polymorphism and dynamic changes of virus evolution in the host. On this basis, two scientific questions were raised: 1) Can the consensus sequence generated from NGS data represent the dominant quasispecies of HIV? 2) What are the characteristics of intra-host HIV DNA quasispecies dynamics under long-term ART?

Objective

Using NGS and iSNV analysis, this study systematically evaluated the accuracy of consensus sequences in research on the HIV dominant quasispecies and studied HIV DNA quasispecies dynamics under long-term ART. The results can deepen our understanding of the latency and pathogenic mechanisms of HIV.

Methods

The HIV-infected participants were selected separately from two HIV cohorts. In section one, 29 participants with 33 whole-blood samples were taken from a long-term follow-up HIV cohort receiving first-line ART, representing a high level of HIV replication in the host, and 33 participants with 42 samples were taken from a cohort of HIV-infected pregnant women in Guizhou, representing a low level of HIV replication in the host. In section two, according to plasma viral load during follow-up, 25 participants with viral suppression (VS group) and 20 participants with failed viral suppression (TF group) were randomly selected from a long-term follow-up HIV cohort in China. In section three, based on plasma viral load measurements during follow-up, 73 HIV-infected pregnant women and 241 longitudinal dried blood spots were included for HIV DNA quantification, and 83 participants and 170 dried blood spots were included in the analysis of HIV quasispecies dynamics during pregnancy.

Longitudinal blood samples were collected, and DNA and RNA were extracted. The reverse transcriptase region of the HIV pol gene was amplified with specific primers, and the amplified products were subjected to NGS and Sanger sequencing.

The key technique is mutation spectrum analysis, which proceeds as follows. First, the raw NGS data were downloaded and filtered by quality control procedures, and low-quality sequencing reads were removed. Second, the sequencing reads were mapped to the reference sequence HXB2. Third, based on the mapping results, the base composition of each site was calculated separately, which determines the mutation profile of the sample. Fourth, a variation table was generated from the mutation profile and the consensus sequence was obtained. Finally, all variations were annotated and classified as SNPs or iSNVs according to their frequency. The consensus sequence and iSNV analytical datasets were each generated from the NGS data.

Using sequence alignment and non-identical base analysis, the Sanger sequence and the consensus sequence of the same sample were compared to evaluate the accuracy of the consensus sequence in research on the HIV dominant quasispecies. Participants and samples were grouped according to plasma HIV viral load and treatment outcome, and the groups were then compared to describe HIV DNA quasispecies dynamics.

Results

Section One

Linear regression analysis was performed on the sequence distance data of the paired sequences. The R-squared (R²) of the samples before ART was 0.84, and the R² of the virus-suppressed samples after ART was 0.984. The P values of both analyses were far below 0.01. Phylogenetic tree analysis showed that 29 (90.6%) paired sequences before ART clustered on the same branch, and the sequences of drug-resistant samples clustered with the sequences of pre-ART samples from the same participants. Among the samples after ART, 33 pairs of sequences clustered on the same branch (76.2%), and the remaining 9 samples clustered with samples from the same participants taken at different sampling time points. Sequences from different sampling time points of the same participant clustered on the same branch, giving 100% correct clustering. All mixed bases in the Sanger sequences were detected by NGS, a detection rate of 100%. The base accuracy rate of the consensus sequence was calculated from completely non-identical sites: it was 99.9% for samples with a low replication level and 99.6% for samples with a high replication level. Among the completely non-identical sites, A-G sequencing errors were the most frequent, accounting for more than 50%, and non-identical bases were dominated by transitions, which made up more than 80% of all sites.

Section Two

Phylogenetic tree analysis showed that all participants were infected with HIV subtype B and that participants in the two groups showed no obvious clustering. In plasma RNA before ART, mutations were randomly distributed across the reverse transcriptase region of the pol gene; the distribution did not differ between the TF group and the VS group, but it varied widely between participants. High-frequency mutations were more likely to co-occur in whole-blood DNA and plasma RNA; the proportion of co-occurring mutations was about 65% and remained stable before and after ART. In samples taken after long-term ART, the average number of iSNVs in HIV DNA and the average base frequency differed significantly between the VS group and the TF group. Linear regression analysis of newly emerged mutations, iSNVs, and SNPs showed that the HIV evolution rate was higher in the TF group than in the VS group, and that the evolution rate of iSNVs was higher than that of SNPs. Based on the linear regression coefficient of the number of newly emerged mutations, HIV DNA mutations accumulated in the treatment-failure participants at a rate of 0.02 mutations/kb/day. Quasispecies analysis of drug resistance-associated mutations (DRAMs) showed that drug-resistant sites accumulated rapidly during treatment in the TF group, with DRAMs accounting for 7% of all iSNVs and 4% of all SNPs.

Section Three

Of the 241 dried blood spot (DBS) samples from HIV-infected pregnant women, 20.3% had an HIV DNA viral load below the detection limit, and 77.4% of the participants had at least one elevated HIV DNA level during pregnancy. Fold change was calculated as the ratio of the HIV DNA viral load at one sampling time and at the next; 51.6% of samples showed an increase. The linear regression coefficient of HIV DNA viral load during pregnancy was greater than 0 in 52.8% of the participants. Comparing the number of iSNVs, the iSNV base frequency, and the diversity index, the difference in Shannon index between the HDL and LDL groups was statistically significant. A high level of HIV replication increases the number of iSNVs in HIV DNA, decreases the average iSNV base frequency, and increases the diversity of HIV DNA quasispecies. Regardless of the level of replication, HIV DNA quasispecies dynamics changed rapidly during pregnancy, with large differences between participants.

Conclusion

1. The consensus sequence generated by NGS can be used to analyze the intra-host HIV dominant quasispecies, and it is highly accurate.
2. NGS and iSNV analysis show that HIV RNA and HIV DNA differ in quasispecies diversity. About 65% of mutations co-occurred in HIV DNA and HIV RNA, a proportion that remained stable before and after ART. During long-term ART, intra-host HIV DNA quasispecies diversity continues to change.
3. A high level of HIV replication drives the rapid intra-host evolution of HIV. In HIV-infected people with failed viral suppression, the evolution rate of the HIV pol gene is 0.02 mutations/kb/day. The rapid accumulation of DRAMs in HIV DNA suggests that HIV DNA can serve as a biomarker for monitoring drug-resistant HIV and is an important aspect of studying the intra-host evolution of HIV DRAMs.
4. Under effective ART, the HIV DNA viral load and intra-host HIV quasispecies dynamics changed greatly during pregnancy, suggesting that HIV viral load and HIV reservoir levels need to be monitored during pregnancy.
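
The mutation spectrum steps described in the Methods (per-site base composition, consensus sequence, frequency-based separation of SNPs and iSNVs) can be sketched roughly as below. This is a minimal Python illustration, not the dissertation's actual pipeline. It assumes the reads have already been quality-filtered, mapped to the reference, and padded to the reference length with "-". The function names, the frequency cutoffs (isnv_min, snp_min), and the per-site averaging used for the Shannon index are hypothetical choices, because the abstract does not state them.

```python
from collections import Counter
from math import log

BASES = "ACGT"


def base_composition(aligned_reads):
    """Count A/C/G/T at each reference position across equal-length reads."""
    counts = [Counter() for _ in range(len(aligned_reads[0]))]
    for read in aligned_reads:
        for pos, base in enumerate(read):
            if base in BASES:  # ignore gaps and uncovered positions
                counts[pos][base] += 1
    return counts


def consensus(counts):
    """Most frequent base per site; 'N' where no read covers the site."""
    return "".join(
        max(BASES, key=lambda b: c[b]) if sum(c.values()) else "N"
        for c in counts
    )


def classify_variants(counts, reference, isnv_min=0.01, snp_min=0.5):
    """Build a variation table of non-reference bases.

    Cutoffs are assumed for illustration only: a variant at or above
    snp_min is labelled 'SNP', one between isnv_min and snp_min 'iSNV'.
    """
    table = []
    for pos, c in enumerate(counts):
        depth = sum(c.values())
        if depth == 0:
            continue
        for base in BASES:
            freq = c[base] / depth
            if base != reference[pos] and freq >= isnv_min:
                label = "SNP" if freq >= snp_min else "iSNV"
                table.append((pos + 1, reference[pos], base, freq, label))
    return table


def shannon_index(counts):
    """Mean per-site Shannon entropy (natural log) over covered sites.

    Averaging over sites is an assumption; other definitions exist.
    """
    entropies = []
    for c in counts:
        depth = sum(c.values())
        if depth:
            entropies.append(
                -sum((n / depth) * log(n / depth) for n in c.values() if n)
            )
    return sum(entropies) / len(entropies) if entropies else 0.0


# Toy data: four reads aligned to a four-base stand-in for the reference.
reference = "ACGT"
reads = ["ACGT", "ACGT", "ACAT", "GCGT"]
profile = base_composition(reads)
print(consensus(profile))
print(classify_variants(profile, reference))
print(shannon_index(profile))
```

In real data the per-site counts would come from the mapped reads (for example a pileup of the alignment against HXB2) instead of padded strings, and the cutoffs would need to account for sequencing error and depth.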
Keywords/Search Tags:HIV, Intra-host single nucleotide variant, Next-generation sequencing, Consensus sequence, Quasispecies diversity, Antiretroviral therapy