
The Impact Of Sequencing Depth And Errors On The 16S rRNA Gene Sequencing-based Studies Of Microbial Diversities

Posted on: 2019-03-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 1360330590970489
Subject: Biology
Abstract/Summary:
The 16S rRNA, approximately 1,540 nt in length, is present in all bacterial ribosomes. Because its function and structure are conserved, 16S rRNA gene amplicons are routinely sequenced with high-throughput sequencing technology to reveal the structure and diversity of microbial communities, with the similarity between sequences representing phylogenetic relationships and the frequency of sequences representing the abundance of the corresponding microbiota. The accuracy of the taxonomic features obtained from 16S rRNA gene sequencing is crucial for discriminating the structure of microbial communities and for identifying key functional species.

The first part of this thesis discusses the influence of sequencing depth on the diversity profiling of microbial communities. Using simulations based on the distribution of operational taxonomic units (OTUs) within a microbial community, a rarefaction curve is drawn to show how an alpha diversity index changes as sequencing depth increases. Generally, the sequencing depth is considered sufficient once the rarefaction curve has reached a plateau. However, we found that different alpha diversity indices had different rarefaction curves and therefore led to disagreement about whether the sequencing depth was sufficient. Moreover, the changing patterns of alpha diversity indices were inconsistent with those of beta diversity indices and did not keep pace with the improvement in group separation as sequencing depth increased. Hence, we recommend using multiple criteria combined with sub-resampling simulation to evaluate the influence of sequencing depth on both alpha and beta diversity indices and to judge whether the sequencing depth is sufficient. Based on demonstration analyses, we suggest that in human microbial studies the sequencing depth should be set at >=5,000 sequences per sample.

We then discuss the influence of sequencing errors on the accuracy and consistency of the taxonomic features obtained from microbial communities, and provide an efficient solution. We found that a portion of sequencing errors remained even after the stringent quality filtering protocols of mainstream pipelines. These errors could introduce a large number of spurious taxonomic features in downstream analysis. To minimize the influence of such sequencing errors on the characterization of taxonomy, we propose an approach with two steps: an abundance filtering (AF) step and a subsequent AF-based OTU picking and remapping (AOR) step. The AF step is based on the concept of a detection limit. It uses bootstrapping simulation to identify and filter out unreliable low-abundance sequences. The AOR step then performs de novo OTU delineation on the reliable high-abundance sequences to obtain accurate taxonomic features. Meanwhile, the low-abundance sequences filtered out at the AF step are rescued wherever they meet the similarity criteria of the obtained OTUs. This two-step approach was evaluated on several data sets, including mock communities that were constructed and sequenced in our lab, simulated communities based on the reference sequences in a 16S rRNA gene database, and four published real data sets. The results showed that our approach can minimize the effect of erroneous sequences on the diversity analysis of microbial communities, thus reducing the risk of misleading downstream experiments, analyses and biological conclusions.

In the last part, we applied our proposed pipeline to a real 16S rRNA gene sequencing data set, which was obtained from chronic hepatitis B (CHB) patients and corresponding healthy controls. We found structural and functional dysbiosis of the gut microbiota in the CHB patients and introduced a gut dysbiosis index (GDI) to measure the systematic changes in gut microbiota through the comparison of "bad vs. good" species. The integrated analysis of the gut microbiome and the host's serum metabolome suggested that changes in the gut microbiota may cause the abnormal accumulation of serum aromatic amino acids (AAA), which are crucial for the development of liver fibrosis, cirrhosis and hepatocellular carcinoma. We concluded that gut dysbiosis may participate in the progression of CHB to liver cirrhosis by altering the metabolic profile of patients.

In summary, this thesis discusses several technical issues in the 16S rRNA gene sequencing-based structural and functional analysis of microbial communities, and provides targeted improvements. Finally, the application value of our improved analysis pipeline was demonstrated in a real study.
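The rarefaction procedure described in the first part of the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the community, the depths, the number of iterations and all function names below are hypothetical, and observed richness and the Shannon index are used only as two common examples of alpha diversity indices. Reads are subsampled without replacement at each depth, and the index is averaged over repeated draws.

```python
import math
import random
from collections import Counter


def subsample_counts(otu_counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))


def observed_richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def shannon_index(counts):
    """Shannon diversity (natural log) of the OTU count distribution."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


def rarefaction_curve(otu_counts, depths, index, n_iter=20, seed=0):
    """Mean value of `index` over `n_iter` random subsamples at each depth."""
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        values = [index(subsample_counts(otu_counts, depth, rng))
                  for _ in range(n_iter)]
        curve.append((depth, sum(values) / n_iter))
    return curve


# Hypothetical community: a few dominant OTUs plus a long tail of rare ones.
community = {f"OTU{i}": max(1, 2000 // (i + 1)) for i in range(200)}
depths = [100, 500, 1000, 5000]

richness_curve = rarefaction_curve(community, depths, observed_richness)
shannon_curve = rarefaction_curve(community, depths, shannon_index)

for (depth, rich), (_, shan) in zip(richness_curve, shannon_curve):
    print(f"depth={depth:>5}  richness={rich:7.1f}  shannon={shan:.3f}")
```

Comparing the two columns shows why a single index can be a poor judge of sufficient depth: the Shannon index is dominated by abundant OTUs and tends to flatten at a lower depth than observed richness, which keeps rising as rare OTUs are picked up.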
Keywords/Search Tags: microbial ecology, 16S rRNA gene sequencing, methodological improvements