| As a model organism among medicinal plants, Salvia miltiorrhiza has four published genome versions. These data have laid a good foundation for molecular biology research on Salvia miltiorrhiza. Unfortunately, each version was sequenced and assembled from a single individual, so these genomes cannot fully reflect the gene-level genetic information of the species, and some important functional genes may be missed. In addition, the lack of a user-friendly online search platform for the genome information greatly limits the development of related fields. To address these deficiencies, in this study we selected the 2020 version of the Salvia miltiorrhiza genome as the reference genome, constructed a pan-genome of Salvia miltiorrhiza from a large amount of genome resequencing data, and annotated its genes in detail. Finally, we built an online service platform for Salvia miltiorrhiza genomics that integrates this annotation information and provides convenient query and indexing services for researchers. The main results are as follows: 1. Comparison of the different genome versions of Salvia miltiorrhiza: The contig N50 and scaffold N50 of the 2020 version are 5 and 50 times larger, respectively, than those of the other three versions. The heterozygosity of the 2021 version is one-fourth that of the 2016 and 2020 versions. The annotation of the 2021 version lacks gene location information. RNA-seq verification and protein integrity verification showed that the annotation of the 2020 version is more accurate than that of the 2016 version. On these grounds, we selected the 2020 version of the genome and its annotation for follow-up research. 2. Assembly of the pan-genome of Salvia miltiorrhiza: Based on resequencing data from 37 Salvia miltiorrhiza genomes (including two published Salvia miltiorrhiza genome sequences) and their comparison with the reference genome, 180 Mb of non-reference sequence was identified. This sequence accounts for about 23.90% of the entire pan-genome. 3. Annotation of the pan-genome of Salvia miltiorrhiza: Combining ab initio prediction, homology-based prediction, and RNA-seq evidence, we predicted 7,153 coding sequences from the non-reference sequences. We also annotated the alternative splicing events of 7,952 multi-exon genes among all 36,389 coding genes. Of the 52,627 proteins encoded by the 36,389 genes, about 90% show sequence similarity to known proteins in the eggNOG database, and 49% can be assigned GO terms. 4. Analysis of the tissue expression profiles of Salvia miltiorrhiza coding genes: Using RNA-seq data from Salvia miltiorrhiza tissues (roots, leaves, hairy roots, flowers, and seedlings), the expression of all genes was quantified as FPKM. A total of 23,896 genes had FPKM values higher than 1 in one or more tissues, and 316, 474, 321, 901, and 344 genes were specifically and significantly expressed in root, hairy root, leaf, flower, and seedling, respectively. More than half of the genes were significantly expressed in all tissues. 5. MicroRNA annotation: Using sRNA-seq data, we identified 64 microRNAs in the Salvia miltiorrhiza pan-genome, comprising 41 conserved and 23 novel microRNAs. The novel microRNAs are not uniformly expressed across roots, stems, leaves, and flowers, and some microRNAs tend to be expressed specifically in certain tissues. Target identification assigned a total of 674 target RNAs to these microRNAs, and GO analysis indicated that these potential targets are involved in the biological regulation of many cellular processes. 6. Construction of the Salvia miltiorrhiza genome online retrieval service platform: We compiled all of the above results and built the platform with the Vue framework, using HTML, CSS, and JavaScript as front-end languages. The database makes it convenient to browse, query, and analyze the results obtained in this study, and it should facilitate and help accelerate research on the molecular biology of Salvia miltiorrhiza. |
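Two of the metrics mentioned above, N50 (used to compare the genome versions) and FPKM (used for the tissue expression profiles), follow standard definitions that can be illustrated with a minimal Python sketch. This is not the pipeline used in this study: the function names `n50`, `fpkm`, and `expressed_in_any_tissue` are hypothetical, and all input values are made up for illustration only.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths (bp).

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must not be empty")


def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


def expressed_in_any_tissue(fpkm_by_gene, threshold=1.0):
    """Return genes whose FPKM exceeds the threshold in one or more tissues.

    fpkm_by_gene maps a gene ID to a dict of {tissue: FPKM value}.
    """
    return [
        gene
        for gene, by_tissue in fpkm_by_gene.items()
        if any(value > threshold for value in by_tissue.values())
    ]


if __name__ == "__main__":
    # Illustrative (made-up) inputs, not data from this study.
    print(n50([80, 70, 50, 40, 30, 20]))
    print(fpkm(500, 2000, 10_000_000))
    demo = {
        "geneA": {"root": 12.4, "leaf": 0.3},
        "geneB": {"root": 0.2, "leaf": 0.7},
    }
    print(expressed_in_any_tissue(demo))
```

The threshold of 1 in `expressed_in_any_tissue` mirrors the "FPKM higher than 1 in one or more tissues" criterion stated in result 4; the criteria for tissue-specific expression are not given in the text and are therefore not sketched here.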