| Accurate and complete genome sequences and annotations are the prerequisite for subsequent analyses and experiments. Although the genome sequence of Fusarium graminearum was completed more than 10 years ago, we have observed errors remaining in the genome sequence and annotation, including mis-assembled regions, base errors, incorrectly predicted gene structures, and missing annotations of untranslated regions (UTRs), transcript isoforms, and long non-coding RNAs (lncRNAs). These errors hamper downstream analyses of gene function. Since the genome was published, many RNA-Seq transcriptome analyses have been carried out for F. graminearum. However, these studies focused mainly on differential gene expression, and little is known about the occurrence and regulation of RNA processing and degradation mechanisms such as alternative splicing (AS), nonsense-mediated mRNA decay (NMD), and alternative polyadenylation (APA). This information is crucial for understanding the transcriptional regulatory mechanisms involved in F. graminearum development and pathogenesis. In addition, the existing transcriptome sequencing data come mainly from the second-generation Illumina platform, and most are conventional non-strand-specific RNA-Seq data, which are poorly suited to transcriptome assembly and to accurate identification of transcript structures and isoforms. We therefore comprehensively analyzed the F. graminearum genome and its transcriptomes at different developmental stages using third-generation PacBio single-molecule real-time (SMRT) sequencing and second-generation Illumina strand-specific (ss) RNA-Seq. The main results are as follows: 1. Correction of the genome sequence. PacBio SMRT genome sequencing reads from the F. graminearum PH-1 strain maintained in our laboratory (YL) were assembled de novo into 10 contigs. By comparing this assembly with the existing PH-1 genome assembly (RR1), we identified 315 regions/loci that differed. Illumina 
DNA-seq resequencing was performed on PH-1 strains from three different laboratories (YL, ZJU, MSU) to exclude the possibility of natural mutations arising during strain preservation. Based on these data, 7 mis-assembled regions, 200 base errors, and 42 indel errors were corrected. The corrected PH-1 genome was named YL1. 2. Reconstruction of transcriptome annotations. A comprehensive gene/transcript annotation of F. graminearum was constructed using Iso-Seq and ss RNA-Seq of mixed RNA samples from 6 different tissues/stages and of independent RNA samples from the vegetative and sexual stages. Using the new Iso-Seq transcriptome pipeline developed in this study, we obtained a transcriptome annotation containing 13,712 genes and 47,589 transcripts. The Iso-Seq annotation averages 3.5 transcripts per gene, compared with just 1 transcript per gene in the previous genome annotation. The median transcript length in the Iso-Seq annotation was 2,348 bp, about 1.6 times the median transcript length in the RR1 annotation (1,462 bp). Compared with the RR1 annotation, 63.4% of the full-length transcripts were new, and 12.2% of the new transcripts came from new gene loci. Overall, the Iso-Seq transcriptome covered 10,958 (77.5%) of the existing RR1 annotated genes; the remaining 3,187 genes were not recovered because of low expression. To further refine the full-length transcriptome annotation, ss RNA-Seq was performed on different tissues. Through transcript assembly, we obtained 1,003 full-length RNA-Seq transcripts for 882 genes lacking Iso-Seq transcripts. In addition, 301 full-length transcripts from new genes were obtained from RNA-Seq. By combining the Iso-Seq and RNA-Seq transcripts with 2,305 RR1 genes without transcript annotations, a complete annotation (YL1) containing 17,189 genes and 51,617 transcripts was finally generated. 3. Alternative splicing and its regulation. Based on the YL1 gene/transcript annotation, 54,613 alternative splicing events were identified from 
4,997 genes. These events gave rise to 12,232 alternatively spliced transcripts, indicating that alternative splicing significantly increases the complexity of the F. graminearum transcriptome. Intron retention (IR) was the predominant type of alternative splicing (61.9%). Based on ss RNA-Seq data from different stages/tissues, we found that alternative splicing is tissue- and stage-specific. Intron-excluded transcripts were mainly expressed in young tissues, whereas intron-retaining transcripts were mainly expressed in aging tissues. Reduced expression of spliceosomal genes may partly account for the increased intron retention in old or dormant tissues. Compared with the canonical transcripts, alternative splicing changed the ORF region in 78.5% of cases. Alternatively spliced transcripts usually have a shortened ORF, indicating that alternative splicing contributes substantially to the proteomic diversity of F. graminearum. Transcripts with premature termination codons produced by alternative splicing are generally thought to be recognized and degraded by the NMD pathway. Interestingly, deletion of FgUPF1, the core component of the NMD pathway in F. graminearum, did not increase the expression of NMD candidate transcripts but instead down-regulated genes related to ribosome biogenesis. Therefore, alternative splicing may not be generally coupled to NMD in F. graminearum. 4. Alternative polyadenylation and its regulation. In F. graminearum, 64.8% of genes have two or more alternative polyadenylation sites, indicating that alternative polyadenylation also significantly increases the complexity of the transcriptome. Based on the YL1 gene/transcript annotation and ss RNA-Seq data from different stages/tissues, we found that short 3’-UTR transcripts were the predominantly expressed form in F. graminearum. However, the proportion of long 3’-UTR transcripts in conidia was significantly higher than in other stages or tissues. Interestingly, we found that the proportion 
of long 3’-UTR transcripts increased gradually during sexual development (3-10 d) and hyphal growth (12-48 h), indicating that long 3’-UTR transcripts are associated with tissue aging. Increased distal polyadenylation site (PAS) usage in aging and dormant tissues may be due to global down-regulation of the core 3’-end processing factors. We also analyzed the 3’-end processing factors of F. graminearum for the first time. Based on ss RNA-Seq data from deletion mutants of 3’-end processing factors (FgFIP1, FgHRP1, FgRNA15), we found that FgRNA15 plays a critical role in the recognition of polyadenylation sites in F. graminearum, indicating its importance in regulating proximal PAS usage. Unexpectedly, we observed cases of altered intron splicing that were not associated with alternative polyadenylation in the FgFIP1, FgHRP1, and FgRNA15 deletion mutants, suggesting that the FgRNA15, FgHRP1, and FgFIP1 genes may promote intron splicing in F. graminearum. 5. Identification of long non-coding RNAs and polycistronic transcripts. Based on the YL1 gene/transcript annotation and ss RNA-Seq data from different stages/tissues, a total of 5,481 lncRNAs were identified, of which 5,303 were novel. Antisense lncRNAs were the main type, accounting for 44.6%. lncRNA expression is stage- and tissue-specific, with the largest number found in the sexual stage. A total of 914 polycistronic transcripts, involving 698 protein-coding genes, were identified in F. graminearum for the first time. Polycistronic transcripts are also stage- and tissue-specific, implying that their expression may be controlled by stage-specific signals. Unlike in prokaryotes, the genes within these polycistronic transcripts can also be transcribed independently. Interestingly, compared with the upstream genes within the same polycistronic transcripts, downstream genes were generally expressed at lower levels in the vegetative growth stage but at higher levels in the sexual stage across independent biological replicates, suggesting that the expression of downstream genes may be repressed by the upstream 
readthrough transcription during vegetative growth but induced during sexual reproduction. Therefore, polycistronic transcripts may play distinct regulatory roles during vegetative growth and sexual reproduction. 6. Construction of an online F. graminearum database. To make the new genome sequence (YL1) and the YL1 gene/transcript annotation easier for researchers to use, we established an online database for F. graminearum (FgBase), available at http://fgbase.wheatscab.com/. FgBase currently supports online genome browsing, BLAST, gene ID queries across annotation versions, and batch sequence download. More functions will be added in the future. |
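
Several summary ratios in the abstract can be recomputed directly from the counts it reports. The short Python sketch below is only an arithmetic check on those reported figures. The variable and function names are illustrative and are not part of the study's pipeline.

```python
# Counts quoted in the abstract; nothing else is assumed.
ISOSEQ_GENES = 13_712          # genes in the Iso-Seq annotation
ISOSEQ_TRANSCRIPTS = 47_589    # transcripts in the Iso-Seq annotation
MEDIAN_LEN_ISOSEQ = 2_348      # median transcript length (bp), Iso-Seq
MEDIAN_LEN_RR1 = 1_462         # median transcript length (bp), RR1
RR1_GENES_COVERED = 10_958     # RR1 genes covered by Iso-Seq
RR1_GENES_NOT_COVERED = 3_187  # RR1 genes not recovered by Iso-Seq


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, rejecting a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


# Average number of transcripts per gene in the Iso-Seq annotation.
transcripts_per_gene = ratio(ISOSEQ_TRANSCRIPTS, ISOSEQ_GENES)

# Median transcript length in Iso-Seq relative to RR1.
median_length_ratio = ratio(MEDIAN_LEN_ISOSEQ, MEDIAN_LEN_RR1)

# Fraction of RR1 genes covered by the Iso-Seq transcriptome.
rr1_total = RR1_GENES_COVERED + RR1_GENES_NOT_COVERED
rr1_coverage = ratio(RR1_GENES_COVERED, rr1_total)

print(f"transcripts per gene: {transcripts_per_gene:.2f}")
print(f"median length ratio:  {median_length_ratio:.2f}")
print(f"RR1 genes covered:    {rr1_coverage:.1%} of {rr1_total}")
```

Keeping the counts as named constants makes it easy to rerun the check if any reported figure is revised.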