| Legumes (Fabaceae or Leguminosae) are the second most important group of crops for humans, contributing a significant portion of the protein and oil in human and animal diets. Over the years, sustained efforts have been made to improve legume yield and quality through breeding, agronomic practices and genetics. Because genomic information is essential for crop improvement programs, multiple legume genome assemblies have been made available, including those of the model legume M. truncatula and of alfalfa (M. sativa), the king of forage. These genomes provide valuable resources for studying legume genomics and for crop breeding and improvement. In recent years, M. polymorpha has been increasingly used both as animal feed and for human consumption. Owing to its nutritional components and edibility, it is eaten either cooked or fresh in China. Besides its economic value, M. polymorpha is an ecologically important plant in farming systems worldwide as a self-reseeding, highly effective nitrogen fixer and organic soil improver. Despite this importance, information about its genome is still lacking. Here, we used an integrated approach combining Illumina, PacBio and Hi-C technologies to sequence and assemble the genome of M. polymorpha. The present study focuses mainly on the following points: 1) comparative genomic analysis of M. polymorpha and other Medicago plants; 2) the genetic basis of lignin biosynthesis in M. polymorpha; 3) detection of nutritional substances in M. polymorpha; 4) characteristics of the HSF family genes in M. polymorpha. (1) Leaf length, leaf width, leaf area, number of branches and plant height differed among M. polymorpha materials. In addition, yield varied widely between materials, with the maximum being six times the minimum. Correlation analysis showed that fresh grass yield was positively correlated with leaf length, leaf width and leaf area, and
the correlation coefficients were 0.565, 0.623 and 0.594, respectively. Further cluster analysis of the agronomic traits and nutritional quality of M. polymorpha from different cities showed that the leaf size, fresh grass yield and relative feeding value of M. polymorpha from Yangzhou were better than those of material from other cities. Therefore, we selected M. polymorpha from Yangzhou for whole-genome sequencing. (2) The genome of M. polymorpha 'Huaiyang Jinhuacai' was sequenced on the PacBio RS II platform. A total of 57.32 Gb of PacBio long reads was obtained, corresponding to approximately 117.18-fold high-quality sequence coverage. Hi-C data (62.59 Gb) were used to assist assembly correction, yielding a contiguous reference genome of 457.53 Mb with a contig N50 of 11.02 Mb and a scaffold N50 of 57.72 Mb. In addition, 92.92% of the genomic sequences were anchored onto seven pseudochromosomes. We predicted a total of 36,087 protein-coding genes, 99.0% of which were functionally annotated. Furthermore, we identified 166.62 Mb of repetitive sequences (accounting for 38.04% of the assembled genome) in the genome of M. polymorpha. We subsequently performed homology searches and annotated noncoding RNA genes, yielding 2,307 miRNAs, 817 transfer RNAs, 2,281 ribosomal RNAs and 1,334 small nuclear RNAs. (3) Comparative genomic analysis revealed that M. polymorpha diverged from the common ancestor of M. truncatula and M. sativa about 15.3 million years ago (Mya). Based on the abundance of 4DTv sites, one significant peak was found in the M. polymorpha, M. truncatula and M. sativa genomes, indicating that the whole-genome duplication (WGD) event occurred before the divergence of G. max and Medicago and supporting the ancestral Papilionoideae WGD event. Synteny block analysis showed strong syntenic relationships among the genomes of M. polymorpha, M. truncatula and M. sativa. A chromosomal fusion event observed in the synteny results indicates that chromosome 3 (chr 3) of M. polymorpha arose from a fusion
between chromosomes 3 and 7 of the common ancestor of M. truncatula, M. sativa and M. polymorpha. (4) We collected the aboveground parts of M. polymorpha at three growth stages (S1: seedling stage; S2: early flowering stage; S3: late flowering stage) for transcriptomic and metabolomic analysis. Differential analysis of the transcriptome data identified 860 differentially expressed genes (DEGs) common to S1, S2 and S3. Functional analysis revealed that the DEGs were enriched in pathways such as carbon metabolism, the citrate cycle and oxidative phosphorylation. Metabolomic analysis revealed a total of 492 annotated metabolites belonging to 9 distinct biochemical groups: flavonoids (107); lipids (67); phenolic acids (66); amino acids and derivatives (59); nucleotides and derivatives (39); alkaloids (28); terpenoids (11); organic acids (37); and other substances (78), such as 13 vitamins and 24 saccharides and alcohols. Differential analysis showed that most nutritional metabolites, such as flavonoids, triterpenoid saponins and amino acids, accumulated at the S2 stage, indicating the high nutritional value of M. polymorpha. (5) We compared the stem structures of M. polymorpha and M. sativa using semi-thin sections. The sections showed that the cell layers in the xylem of M. sativa were thicker than those in M. polymorpha, indicating a lower lignin content in M. polymorpha stems. The M. polymorpha genome contains 65 lignin-biosynthesis-related genes, whereas 142 are present in G. max, 77 in M. truncatula and 77 in M. sativa. The numbers of genes encoding HCT, CCoAOMT, COMT and, in particular, laccase were significantly lower than in the other species tested. Using transcriptome data, we identified 36 lignin-biosynthesis-related genes, of which 19 and 11 were up-regulated at S3 and S2, respectively. This is consistent with the metabolomic results, in which the contents of several metabolites in the lignin biosynthesis pathway, especially p-coumaryl alcohol, coniferyl alcohol and sinapyl alcohol, increased significantly from S1 and S2 to
S3. In addition, the content of sinapyl alcohol was lower than that of coniferyl alcohol at S2. (6) The crude protein contents of various cultivars of M. polymorpha at the early flowering stage were measured. The results showed that the crude protein content of M. polymorpha ranged from 17.8% to 22.2%. Comparative genomics revealed a total of 18 positively selected genes involved in protein metabolic processes. Furthermore, differential metabolite analysis showed that 26 amino acids and derivatives accumulated significantly at S2 or S3, suggesting that these positively selected N metabolism-related genes could be responsible for crude protein biosynthesis. (7) Based on our M. polymorpha genome sequence, 22 nonredundant HSF genes, named MpoHSFs, were identified. Chromosomal localization showed that the 22 MpoHSF genes are unevenly distributed across the 7 chromosomes. Gene structure analysis revealed that most MpoHSFs have only one intron, 6 MpoHSFs have 2 introns, and only one MpoHSF has 3 introns. The cis-acting elements in the promoters of the M. polymorpha HSF genes are mostly related to drought, anaerobic induction, and methyl jasmonate, abscisic acid and gibberellin responses, suggesting that the HSF transcription factor family genes of M. polymorpha play important roles in plant stress resistance. |
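
For readers unfamiliar with the N50 statistic quoted for the contigs and scaffolds above, the following is a minimal sketch of how it is commonly computed: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions; they are not taken from the study, and the assembly software used there may compute the statistic differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # total is 30; 10 + 6 = 16 >= 15, so N50 is 6
```

Applying the same function to scaffold lengths instead of contig lengths gives the scaffold N50.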