| Miracle fruit (Synsepalum dulcificum) is a perennial shrub belonging to the genus Synsepalum of the family Sapotaceae, in the order Ericales. It is a rare and treasured economic fruit tree of tropical regions, known for miraculin, a sweetening glycoprotein that makes sour flavors taste sweet to humans. This taste-modifying property is not only important for the sweetener market but has also shown excellent performance in the treatment of diabetes. Miracle fruit also contains several important nutrients, including proteins, lipids, vitamins, amino acids and dietary phytochemicals. Moreover, extracts from different tissues of the fruit are of medicinal importance. To date, however, miracle fruit has not been studied well enough to explain its edible, medicinal and ornamental properties. A high-quality reference genome will greatly facilitate research into the function and evolution of miraculin and the development of the medicinal value of the various tissues of miracle fruit. Therefore, this study performed whole-genome sequencing of miracle fruit to obtain a chromosome-level reference genome. Based on this reference genome, combined with metabolome and transcriptome data, the study revealed metabolite accumulation patterns and gene expression trends during fruit development. In addition, tissue-specific highly expressed genes and high-content metabolites were identified, the molecular regulatory basis of anthocyanin biosynthesis was preliminarily analyzed, and the potential reasons for the particularity of miraculin in miracle fruit were further explained. Our research provides a theoretical basis for the subsequent genetic improvement of miracle fruit. The main results are as follows:

1. A high-quality chromosome-level reference genome of miracle fruit

The genome size of miracle fruit was estimated by flow cytometry and k-mer analysis, respectively. Ploidy and chromosome
number were investigated using karyotype analysis. Guided by these results, a total of ~119.57 Gb (226×) of PacBio long reads and 104.90 Gb (~198×) of Illumina short reads were obtained. Using PacBio reads for assembly and Illumina reads for error correction, a contig-level genome of 568.98 Mb was obtained, comprising 63 contigs with a contig N50 of 14.14 Mb. The assembly was then scaffolded to chromosome scale using ~122.72 Gb (231×) of clean Hi-C data. The 63 contigs were clustered into 13 chromosomes with an anchoring rate of ~96.63%, yielding a chromosome-level reference genome of ~549.84 Mb. BUSCO assessment showed that 96.5% of the genes in the BUSCO plant set were present as complete genes in the S. dulcificum genome, and the LAI score was 19.15. Furthermore, the Illumina, PacBio and Hi-C data were mapped back to the final assembly, giving coverage of 99.29%, 99.06% and 93.64%, respectively. In addition, 96.56-98.27% of the RNA-seq reads from 30 samples aligned well to the assembled genome. These data show that the assembled miracle fruit genome has high completeness, accuracy and continuity.

2. Genome annotation of miracle fruit

Using a combination of de novo prediction and homology search, repetitive sequences were found to make up 53.60% of the S. dulcificum genome. Long terminal repeats (LTRs) were the main type, accounting for 40.20%. For protein-coding gene prediction, ab initio prediction, homology search and transcriptome-based prediction were combined, and 37,911 protein-coding genes were predicted in S. dulcificum. Non-coding RNA annotation identified 117 microRNAs, 761 transfer RNAs, 215 ribosomal RNAs and 94 small nuclear RNAs. In addition, 1,967 transcription factors (TFs) from 58 families were identified. Combined with the transcriptome data, 88.95% of the genes were found to be expressed in at least one tissue. Functional annotation against the PFAM, COG, KEGG and GO databases showed that 81.37% of the genes could
be functionally annotated in at least one database.

3. Comparative genomic analysis of miracle fruit

The miracle fruit genome obtained in this study is the first reference genome for the family Sapotaceae and is of great importance for determining the evolutionary position of Sapotaceae at the genome level. Sequenced species from 7 families of the order Ericales were therefore selected, together with the representative eudicots Vitis vinifera and Arabidopsis thaliana, and the monocot Oryza sativa was used as the outgroup. Single-copy orthologous genes were used for species tree construction and divergence time estimation. A total of 293 single-copy orthologous genes were obtained from the 11 species. Phylogenetic analysis revealed that S. dulcificum is closely related to Camellia sinensis and Diospyros oleifera; S. dulcificum diverged from the genus Diospyros ~67.8 MYA (50.2-82.6 MYA), and C. sinensis diverged from Synsepalum ~63.5 MYA (45.4-78.5 MYA). Intergenomic collinearity analysis indicated that the S. dulcificum genome has a two-to-one syntenic relationship with V. vinifera. Self-collinearity analysis of S. dulcificum found that one region of the genome has two corresponding collinear regions. Synonymous substitution rate (Ks) analysis detected a major peak for S. dulcificum at Ks = 0.56. These results indicated that S. dulcificum underwent a whole-genome duplication (WGD) event.

4. Gene family analysis of miracle fruit

To further explore the relationship between the specific traits and the genes of miracle fruit, this study analyzed species-specific gene families and gene families that contracted or expanded during evolution. To show the unique gene families of S. dulcificum clearly, five important species (V. vinifera, A. chinensis, D. oleifera, V. macrocarpon and A. corniculatum) were selected for further analysis, and 1,041 gene families were found to be specific to miracle fruit. GO enrichment analysis showed that these unique gene families were mainly enriched in terpenoid
biosynthesis, phytoalexin metabolism, plant secondary metabolism and host regulation of virus defense responses. KEGG enrichment showed that they were mainly enriched in monoterpene biosynthesis, brassinolide biosynthesis, cytochrome P450 synthesis and ABC transport. Gene family analysis of the 11 species revealed that miracle fruit contains 15,799 gene families. Combining the phylogenetic tree with the species divergence times, a total of 18,640 gene families were inferred in the most recent common ancestor of the 11 species. In S. dulcificum, 3,828 gene families expanded and 4,739 gene families contracted. Enrichment analysis of the significantly expanded gene families revealed marked enrichment in genes involved in the PPAR signaling pathway, phenylpropanoid biosynthesis, defense response to fungus, positive regulation of immune responses, DNA replication and other biological processes. Enrichment analysis of the significantly contracted gene families revealed significant enrichment in genes involved in monooxygenase activity, response to UV light, lignin biosynthesis and monopost biosynthesis.

5. Metabolomic and transcriptomic changes during fruit development and analysis of the regulatory basis of anthocyanin biosynthesis

To study the characteristic components of each tissue of miracle fruit, the change in fruit color and the mechanism of its formation, metabolome profiling and transcriptome sequencing were carried out on roots, stems, leaves, flowers, fruits and seeds at three different stages. A total of 737 annotated metabolites in 11 categories were identified across all samples by widely targeted metabolite analysis. Analysis of tissue-specific high-content metabolites found that there were 33, 30, 35, 37, 65 and 95 such metabolites in roots, stems, leaves, flowers, fruits and seeds, respectively. Differential metabolite analysis showed that metabolites involved in phytohormone biosynthesis and vitamin B6 metabolism were continuously
enriched during the transition of the fruit from the young stage to the turning stage, and metabolites in phenylpropanoid biosynthesis, glutathione metabolism and other pathways were enriched as the fruit developed from the turning stage to the mature stage. Transcriptome analysis found that 28,560 genes were expressed (FPKM > 1) in the 30 samples. Differential expression analysis showed that genes up-regulated in fruit from the young stage to the turning stage were mainly enriched in flavonoid biosynthesis, phenylpropanoid biosynthesis, amino acid metabolism and other processes. Genes up-regulated from the turning stage to the mature stage were mainly enriched in response to brassinolide, plant hormone signal transduction, the MAPK signaling pathway and other processes. The multi-omics data presented here, together with previously reported data, indicated that the changes in fruit color were caused by the accumulation of anthocyanins. Key genes of the anthocyanin biosynthesis pathway were analyzed before and after the color-change stage, and a total of 25 structural genes involved in anthocyanin biosynthesis were identified. In addition, 10 positively regulated and 10 negatively regulated transcription factors that might regulate anthocyanin biosynthesis were preliminarily identified.

6. Analysis of the special properties and function of miraculin

Transcriptomic analysis found that the miraculin gene (MIR, Chr10G0299340) was highly expressed in fruit flesh, with an FPKM value of ~113,515. Homologous genes of miraculin were also identified in C. sinensis, V. vinifera and D. oleifera, and the expression level of MIR in miracle fruit was found to be at least a hundred times higher than that of these homologs. Furthermore, protein sequence analysis showed that MIR has a unique signal peptide sequence that guides secretion and a histidine residue at position 30 with taste-modifying activity. Moreover, integrated analysis of WGCNA, enrichment and metabolite correlation suggested that miraculin plays a
potential role in regulating plant growth, seed germination and maturation, and resistance to pathogen infection and environmental stress.

In summary, a high-quality chromosome-level genome of miracle fruit was obtained. Comparative genomic analysis showed that a WGD event occurred during the evolution of S. dulcificum and clarified the position of Sapotaceae in plant phylogeny. In addition, the metabolomic and transcriptomic changes during fruit development and the molecular basis of the regulation of anthocyanin biosynthesis in miracle fruit were analyzed. The mechanisms underlying the unique properties of miraculin and its functions in the species itself were also elucidated. |
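
The assembly statistics reported in result 1 (contig N50, sequencing depth and anchoring rate) follow from simple arithmetic. The sketch below is illustrative only and is not the pipeline used in the study: the function names are hypothetical, the contig lengths in the N50 example are toy values, and the depth and anchoring formulas assume the conventional definitions (depth = sequenced bases / genome size; anchoring rate = anchored length / contig assembly length).

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L of the contig at which, going from longest to
    shortest, the running total first reaches half of the assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def implied_genome_size_mb(data_gb, depth):
    """Genome size (Mb) implied by a data volume (Gb) and a fold depth."""
    return data_gb * 1000 / depth


def anchor_ratio(anchored_mb, assembly_mb):
    """Fraction of the contig assembly placed onto chromosomes."""
    return anchored_mb / assembly_mb


if __name__ == "__main__":
    # Toy contig lengths (hypothetical, not from the study).
    print("toy N50:", n50([40, 30, 20, 10]))

    # Data volumes and depths as reported in the abstract.
    for name, gb, depth in [("PacBio", 119.57, 226),
                            ("Illumina", 104.90, 198),
                            ("Hi-C", 122.72, 231)]:
        size = implied_genome_size_mb(gb, depth)
        print(f"{name}: implied genome size {size:.1f} Mb")

    # Chromosome-level length over contig-level length, as reported.
    print(f"anchoring rate: {anchor_ratio(549.84, 568.98):.2%}")
```

Under these assumptions, the three reported data volumes and depths imply a genome size of roughly 529 to 531 Mb, and 549.84 Mb out of 568.98 Mb gives an anchoring rate of about 96.6%, in line with the reported ~96.63%.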