Study On Identification And Systematic Classification Of Dendrobium With DNA Barcode

Posted on: 2019-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Zhong
GTID: 2404330548985482
Subject: Pharmacy
Abstract/Summary:
Objective: Dendrobium Sw. comprises about 1200-1500 species, most of which have important medicinal and ornamental value. Because morphological characteristics are similar among and within species, the genus is widely distributed, and hybrids are numerous, medicinal materials of variable quality from Dendrobium Sw., as well as adulterants, have appeared on the market, so it is hard to ensure medication safety and quality. At present, DNA barcoding is considered a promising, rapid and accurate method for species identification and phylogenetic analysis. Although single-locus or multi-locus DNA barcodes can identify Dendrobium Sw. species to varying degrees, their identification power has certain limitations. With the continuing maturation of genome sequencing technology, the whole chloroplast genome (cp-genome) of plants is often used for taxonomic identification and phylogenetic analysis. We obtained the cp-genome sequence of Dendrobium officinale ‘Zhongke No.4’ using the Illumina HiSeq 4000 and Pacific Biosciences RSII sequencing platforms, drew its cp-genome structure map and annotated its genes. With the cp-genome of Dendrobium officinale ‘Zhongke No.4’ as the reference, we resequenced the cp-genomes of 50 samples on the Illumina HiSeq 4000. Bioinformatic and population genetic analyses were performed on the 51 sequences to search for SNP and InDel loci and to construct a phylogenetic tree, in order to describe the relationships among Dendrobium Sw. species and establish an identification system for Dendrobium Sw. By comparing the identification and classification ability of coding genes, non-coding regions, SNP-rich single genes and different gene combinations in the cp-genome, we screened for the genes that performed best in identification and classification as candidate DNA barcodes. Total DNA of 79 specimens was extracted as the template, and PCR amplification, sequencing and cluster analysis were carried out with both the candidate barcodes and ITS2 and matK. We compared the identification and classification performance of
the cp-genome, candidate barcodes, ITS2 and matK sequences in order to verify and evaluate the most effective DNA barcode for Dendrobium Sw.

Methods:
1. Sample collection and DNA extraction: fresh leaves were collected, cleaned with 70% alcohol and stored at -80 ℃. cp-genome DNA was extracted, and NanoDrop 2000, Qubit and agarose gel electrophoresis were used to assess DNA purity, concentration and integrity, respectively. A Covaris S220 was used to shear the genomic DNA into fragments for library construction; bridge PCR was used to generate DNA clusters, and the amplified DNA was linearized into single strands.
2. Cp-genome sequencing and analysis of raw sequencing data: Illumina HiSeq 4000 sequencing used the TruSeq v3 SBS-HS kit (300 cycles). PacBio RSII sequencing used the G-tube method. Base distribution and quality fluctuation across all sequencing reads were summarized with bioinformatic methods. The raw Illumina HiSeq data were quality-trimmed, and the read-length distribution of the filtered data was plotted.
3. Cp-genome assembly and annotation: SOAPdenovo (v2.04) and BLASR were used for preliminary assembly of the Illumina and PacBio sequencing data, respectively, followed by assembly with Celera Assembler 8.0. The Illumina data were then used again for correction, and gaps were closed with GapCloser v1.12. DOGMA was used to predict and annotate genes in the resulting sequence, and the cp-genome structure map of Dendrobium officinale ‘Zhongke No.4’ was drawn. COG annotations were obtained with blastP (BLAST 2.2.28+) against the STRING v9.05 database and used to classify protein functions. Predicted genes were compared with the KEGG gene database (Genes) using BLAST (blastX/blastP 2.2.28+), and the biological pathways in which the corresponding genes participate were obtained from the matched KO numbers. The BLAST results were analyzed with Blast2GO.
4. Building the “cp-DNA-Barcode” identification system: we resequenced the chloroplast genomes of 50 Dendrobium
samples with Illumina. BWA was used to align the sequences to the reference genome and to calculate sequencing depth and coverage relative to the reference. GATK was used to detect SNPs and InDels in the genome, and the population SNP set was obtained with an in-house script. MEGA 7.02 was used to construct the phylogenetic tree with the maximum likelihood (ML) method and to infer the relative distances among Dendrobium Sw. species. Principal component analysis (PCA) was performed on the SNP data.
5. Screening of candidate DNA barcodes based on cp-genome sequences and SNP distribution information: sample DNA was extracted, and PCR amplification and sequencing were carried out for ITS2, matK and the candidate DNA barcodes. BioEdit 7.0, SnapGene Viewer 2.6.2 and MEGA 7.02 were used to splice, assemble and analyze the sequences, and the data were submitted to GenBank. Phylogenetic trees were constructed from the candidate DNA barcode, ITS2 and matK sequences, and their performance in identifying and classifying Dendrobium Sw. species was analyzed and compared. The secondary structure of the ITS2 sequences was also predicted.

Results:
1. A whole cp-genome sequencing scheme based on the Illumina and PacBio RSII platforms was established. A high-quality cp-genome sequence of Dendrobium officinale ‘Zhongke No.4’ was assembled, with a total length of 152,185 bp, consisting of a large single-copy region (LSC, 85,094 bp), a small single-copy region (SSC, 14,521 bp) and two inverted repeat regions (IRs, 26,285 bp each). The GC content was 37.46%. A total of 127 genes were annotated, including 89 protein-coding genes, 30 tRNA genes and 8 rRNA genes. Protein-coding, tRNA and rRNA genes accounted for 53.86%, 1.48% and 6.92% of the 152,185 bp, respectively. Twelve genes contained one or two introns: the protein-coding genes atpF, clpP, ndhB, ndhB, ndhF, rpl2, rpl2, rpoC1, ycf1, ycf15, ycf15 and ycf3 (ndhB, rpl2 and ycf15 are each listed twice).
2. The COG annotation results show that the number of genes involved in translation, ribosomal structure and biological
origin (biogenesis) and in energy production and conversion was the highest, at 33 and 26 respectively. KEGG functional annotation showed that the most genes were involved in metabolic pathways, photosynthesis and ribosomes, at 38, 29 and 21 respectively. GO annotation showed that the participation rates of 66 genes in cellular component, biological process and molecular function were 50%-100%, 1.51%-93.93% and 9.09%-63.63%, respectively.
3. The cp-genomes of 51 Dendrobium Sw. plants were obtained by resequencing; the 51 cp-genome sequences contained 35,685 SNP sites and 3,944 InDels. The proportions of SNPs in introns, exons, upstream regions, downstream regions and intergenic regions were 31.10%, 28.49%, 23.21%, 15.22% and 1.98%, respectively. The proportions of InDels in introns, upstream regions, downstream regions, exons and intergenic regions were 44.27%, 31.52%, 17.60%, 4.54% and 2.08%, respectively.
4. Based on the K2P model of intra- and inter-specific divergence, inter-specific distances ranged from 0.000 to 0.025, equaled zero in only 0.08% of cases, and exceeded 0.0075 in 88.1% of cases. Intra-specific distances ranged from 0.000 to 0.005, and most Dendrobium species represented by more than two samples in this study (88.24%) had a unique sequence in the cp-genome region. The phylogenetic tree constructed by the ML method indicated that the identification rate of the “cp-DNA-barcode” among the 51 Dendrobium species was 100%.
5. ycf1b and clpP-psbB were selected as candidate DNA barcodes based on cp-genome sequence and SNP distribution information. The discrimination rates of matK, ycf1b and clpP-psbB for the 51 Dendrobium samples were 96.25%, 97.50% and 97.50%, respectively. When the sample size was 80, the discrimination rates of ITS2, matK, ycf1b and clpP-psbB were 90.00%, 87.50%, 92.59% and 45.00%, respectively. In addition, the clustering result based on the clpP-psbB sequence differed from the traditional classification and had low support. The clustering results based on the
sequences of ITS2, matK and ycf1b all had higher support. Based on the configuration, size, number and position of arm loops, and the angles between the four helical arms, the ITS2 secondary structures of the 79 samples could be roughly divided into four categories: A, B, C and D. ITS2 secondary structure also differed to some extent between samples of the same species from different origins, such as D. officinale, D. primulinum, D. aurantiacum and D. lituiflorum. In addition, D. anosmum and D. williamsonii, D. christyanum and D. cariniferum, and D. hainanense and D. hancockii have close genetic relationships based on ITS2, matK and ycf1b sequences.

Conclusion:
1. The identification rate based on the cp-genomes of the 51 samples in this study was 100%. Three D. officinale plants of different origin (Zhejiang, Lang mountains and Sichuan) could be identified effectively. However, Dendrobium comprises more than 1200 species, and whether the cp-genome can completely solve the identification problem for the genus still needs to be verified by many further studies.
2. ML trees constructed from coding and non-coding genes, single genes and combinations of 2, 3, 10 and 16 genes showed that the topology, classification results and identification efficiency of the ycf1b tree were most similar to those of the cp-genome. Classification and identification based on combinations of 2, 3, 10 or 16 genes were no better than with ycf1b. The verification tests also showed that ycf1b sequences perform better in classifying and identifying Dendrobium Sw. species.
3. ITS2, matK and ycf1b sequences can identify most of the samples in this study, but none of them can identify all Dendrobium Sw. species, and the species identification rate declines as the number of samples increases, whether it is based on matK, ycf1b or clpP-psbB sequences. The classification results based on the sequences of ITS2, matK and ycf1b are
more reliable, because their bootstrap values were all higher than those of clpP-psbB sequences. This sequence information provides a reliable basis for identifying most Dendrobium Sw. species and offers important evolutionary information for systematics and bioinformatics research on the genus. The secondary structures of ITS2 sequences can assist in analyzing evolutionary relationships among Dendrobium Sw. species. The approach and methods explored in this study provide an important reference for future research on DNA barcoding in Dendrobium Sw.
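
The intra- and inter-specific distances reported in the Results are based on the K2P (Kimura two-parameter) model. The abstract does not say how those distances were computed, so the following is only a minimal sketch of the standard K2P formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name k2p_distance and the choice to skip gaps and ambiguous bases are assumptions made for illustration, not details taken from the thesis.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID_BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration. Alignment columns containing
    a gap or an ambiguous base are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

For example, k2p_distance("ACGTACGTAC", "GCGTACGTAC") compares ten sites with a single A/G transition. Pairwise distances of this kind, computed within and between species, are what the reported ranges (0.000-0.005 and 0.000-0.025) summarize.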
Keywords/Search Tags: Dendrobium Sw., Chloroplast genome, DNA barcode, Classification and identification, Phylogenetic analysis