Caragana Fabr. comprises about 100 species worldwide, of which 66 are distributed in the northeastern, northern, northwestern, and southwestern provinces of China. China is one of the genus's important centers of distribution and differentiation, which makes it crucial for research on systematic evolution and biogeography. The related genera Halimodendron Fisch. ex DC. and Calophaca Fisch. ex DC. belong, like Caragana, to Fabaceae (Leguminosae), subfamily Papilionoideae, and to the inverted repeat-lacking clade (IRLC). In the traditional classification, Caragana, Halimodendron, and Calophaca belong to Trib. Galegeae (Br.) Torrey et Gray and subtrib. Astragalinae (Adans.) Benth. et Hook. f., but recent molecular phylogenetic studies have shown that Halimodendron and Calophaca are nested within Caragana, and the phylogenetic relationships still need further study. To address these questions, this study sampled most species of the three genera in China, reconstructed their phylogenetic relationships using chloroplast genome and nuclear ribosomal DNA (nrDNA) data, and investigated the origin and evolution of the genus through ancestral character reconstruction, ancestral area reconstruction, and divergence time estimation. The chloroplast genome structure, codon usage preference, and gene selection pressure of Caragana were also studied to lay a foundation for further research on the genetic resources of the genus. The results are as follows. 1) A total of 65 complete chloroplast genomes were obtained, and ETS and nrDNA data were also extracted. The chloroplast genomes range in size from 125,272 bp to 133,621 bp. Between 108 and 111 genes are annotated, including 75-77 protein-coding genes (CDS), 29-30 tRNA genes, and 4 rRNA genes. The total GC content is 34.2%-35.2%. Most genomes contain 76 CDS. In all species of C. sect. Jubatae, C. rosea, and C. soongorica, every chloroplast gene is single-copy, whereas the trnN-GUU tRNA gene has two copies in the other species. The nrDNA sequence length ranges from 5,608 bp to 5,836 bp, and the GC content
ranges from 53.3% to 53.8%. The nrDNA is composed of three genes (18S, 5.8S, and 26S) and two internal transcribed spacers (ITS1 and ITS2). The ETS is approximately 590 bp in length. 2) Genome comparison shows that mononucleotides are the most abundant SSRs, among which poly-A and poly-T are the most common motifs. Among the large repeat sequences, forward and palindromic repeats are the most abundant. Notably, C. chinghaiensis has a palindromic sequence of 2,836 bp. 3) Phylogenetic analysis shows that the two genera are nested within Caragana, in agreement with previous studies. The phylogenetic tree based on the chloroplast genome supports the monophyly of C. sect. Caragana, C. sect. Bracteolata, C. sect. Frutescentes, C. sect. Tragacanthoides, and C. sect. Calophaca. C. sect. Jubatae is divided into two branches (I and II); clade I is separated from C. sect. Jubatae and renamed C. sect. Tanguticae. These monophyletic clades are strongly supported [PP = 1, BS (bootstrap) = 100%]. C. sect. Spinosae and Halimodendron halodendron form a strongly supported monophyletic group (PP = 1, BS = 100%), so Halimodendron is no longer treated as a separate genus and is placed in C. sect. Spinosae. The phylogenetic tree based on ETS and nrDNA differs from the chloroplast genome tree, but the species assigned to each section are largely the same. The results support the monophyly of C. sect. Caragana (PP = 1, BS = 100%), C. sect. Bracteolata (PP = 1, BS = 99%), C. sect. Tragacanthoides (PP = 1, BS = 100%), and C. sect. Calophaca (PP = 1, BS = 100%). The results for C. sect. Jubatae are the same as in the chloroplast tree, with the section again divided into two branches. C. sect. Tanguticae (clade I; PP = 1) is monophyletic, but its relationship with C. sect. Frutescentes remains unclear (support: PP = 1). C. sect. Halimodendron is sister to it (PP = 1, BS = 96%). 4) The ancestral area of Caragana is the desert region of southern Xinjiang and western Inner Mongolia. The genus originated about 27.47 Mya (95% HPD: 16.48-38.44 Mya) in the late Oligocene, and its main lineages diverged in the
Miocene, corresponding to the climate cooling event and the uplift of the Qinghai-Tibet Plateau (QTP). Characters evolved from pinnate leaves with many leaflets and a single inflorescence to pseudopalmate leaves with four leaflets and flowers borne in pairs on a peduncle; the inflorescence even shows a reversal to a raceme. 5) Among the codons of Caragana, UUA (L) is the most preferred, and only one of the 29 codons with RSCU > 1 ends in G, indicating that the genus prefers codons ending in A or U, consistent with codon usage patterns in other dicotyledons. The neutrality plot and ENC-plot analyses show that codon usage in the genus is affected by natural selection. 6) The selection pressure analysis of the clpP, ycf1, rbcL, and matK genes shows that, compared with the conserved rbcL and matK genes, the clpP and ycf1 genes of Caragana have shorter branch lengths and lower nonsynonymous (dN) substitution rates on the synonymous (dS) and dN substitution-rate trees. Three genes under significant positive selection (clpP, rps18, and rps7) and nine rapidly evolving genes are identified in the selection pressure analysis, which will help us understand the adaptation of Caragana species.
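
The relative synonymous codon usage (RSCU) values cited in result 5 can be illustrated with a minimal sketch. It assumes the usual definition, in which a codon's observed count is divided by the mean count of all synonymous codons for the same amino acid, and it assumes the standard genetic code. The function name `rscu` and the toy sequence are hypothetical and are not taken from the study's pipeline; codons are returned in DNA form (for example, TTA for UUA).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (assumed here), listed in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (DNA or RNA)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed count / (family total / family size)
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    toy = "UUAUUAUUACUG"  # hypothetical sequence: three UUA (Leu), one CUG (Leu)
    for codon, value in sorted(rscu(toy).items()):
        print(codon, round(value, 2))
```

For this toy sequence, leucine has six synonymous codons and four observations, so TTA scores 3 × 6 / 4 = 4.5 and CTG scores 1.5. A value above 1 marks a codon used more often than expected under uniform synonymous usage, which is the criterion behind the 29 preferred codons reported above.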