Global Identification of Alternative Splicing Genes and Genome Re-annotation of the Woodland Strawberry Fragaria vesca

Posted on: 2020-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Li
GTID: 1363330572484930
Subject: Pomology
Abstract/Summary:
The cultivated strawberry (Fragaria × ananassa) is a perennial herb that consumers favor for its bright color and rich nutrition. Botanically, the strawberry fruit is an accessory fruit: it consists of abundant dry achenes (the botanical fruits) that dot the surface of a fleshy and juicy shoot tip, the receptacle. Unlike other Rosaceae crops such as apple and peach, strawberry fruit is non-climacteric. Cultivated strawberries are octoploid (2n = 8x = 56), and their genome, which originates from four diploid progenitor species, is very complex. One of these ancestors, the diploid strawberry F. vesca, also known as woodland strawberry, is the most widely distributed species growing naturally in the northern hemisphere. The F. vesca genome (~240 Mb) is relatively small, which is a significant advantage for genomic research. Furthermore, F. vesca has a short stature and a short life cycle, flowers repeatedly, and can be transformed efficiently. It is therefore a model system for studying strawberry and non-climacteric fruit.

In this study, Illumina and SMRT sequencing techniques were combined to analyze the dynamic changes of alternative splicing (AS) during strawberry flower and fruit development. We also re-annotated two versions of the F. vesca genome, which significantly improved the accuracy and completeness of the genome annotation. The main results were as follows:

1. SMRT has higher efficacy in alternative splicing identification

In this study, the AS landscape was characterized and compared between the single-molecule, real-time (SMRT) and Illumina RNA-seq platforms. We identified 33,236 full-length isoforms from 10,957 gene models in the strawberry annotation. Although SMRT has a lower sequencing depth, it identifies more genes undergoing AS (57.67% of detected multiexon genes) than Illumina does (33.48%), illustrating the efficacy of SMRT in AS identification.

2. AS dynamics during woodland strawberry fruit development

To profile AS during strawberry fruit development, we used our previous datasets of 74 RNA-seq libraries, which include finely dissected fruit tissues at five developmental stages and total 1,951 million reads. Overall, 66.43% of multiexon genes were found to be alternatively spliced, and intron retention (IR) was the most frequent AS type, followed by alternative acceptor (AA), alternative donor (AD), and exon skipping (ES). In addition, 2,543 genes were found to gain or lose conserved domains as a result of AS. Next, we characterized the AS dynamics during fruit development. We found that IR was significantly reduced immediately post-fertilization in strawberry fruit; in contrast, the percentage of AA increased greatly post-fertilization. The KEGG pathway 'Spliceosome' and GO terms for several important metabolic processes were significantly enriched among the genes exhibiting differential IR. Moreover, transcripts of almost all genes in the KEGG 'Spliceosome' pathway are present at a much higher level in the receptacle at stage 1 than at stages 2-5. These results indicate that IR may serve as a mechanism for rapid activation of cell division and expansion during fruit initiation upon fertilization.

3. Re-annotation of the woodland strawberry V2 genome

During the transcriptome data analysis, we noticed that a considerable proportion of genes are misannotated and that the annotation contains only the coding sequences of protein-coding genes. To improve the annotation, we developed an optimized pipeline to re-annotate the strawberry genome, taking advantage of PacBio full-length transcripts and an extensive dataset of RNA-seq libraries. In this pipeline, MAKER2, AUGUSTUS and PASA were used, and Apollo was then employed to manually revise some gene models. We first re-annotated the V2 genome, and the new annotation was named v2.0.a2. In the v2.0.a2 annotation, 13,168 (39.3%) protein-coding genes were modified or newly identified, 18,641 genes (55.6% of 33,538 genes) were augmented with information on the 5' and/or 3' UTRs, and 7,370 genes were found to possess alternative isoforms. In addition, 1,938 long non-coding RNAs, 171 miRNAs, and 51,714 small RNA clusters were integrated into the annotation.

4. Re-annotation of the woodland strawberry V4 genome

In 2018, a high-quality woodland strawberry V4 reference genome was assembled using single-molecule real-time sequencing from Pacific Biosciences (PacBio). Although V4 contains 24.96 Mb of new sequence, the number of annotated genes in V4 is thousands lower than in the previous annotations. In addition, the older genome annotations name genes as geneXXXXX, while the new genome implements a new gene naming system, FvH4_XgXXXXX. To improve the quality of the V4 genome annotation, we built a new and improved annotation, v4.0.a2. This annotation has a total of 34,007 gene models with 98.1% complete BUSCOs. Gene models of 8,342 existing genes were modified, 9,029 new genes were added, and 10,176 genes possess alternatively spliced isoforms. To make use of the valuable transcriptome data resources generated previously, we established a digital gene expression atlas from 46 tissue types, which is convenient for researchers to use. Moreover, a total of 84 known and 63 novel miRNAs were identified, and their targets were predicted.

Altogether, our work demonstrates that SMRT sequencing is highly powerful in AS discovery and provides a rich data resource for later functional studies of different isoforms. Further, shifting AS modes may contribute to rapid changes of gene expression during fruit set. The new annotations of F. vesca are substantially improved in both accuracy and integrity of gene predictions, which benefits gene function studies in strawberry and comparative genomic analysis of other horticultural crops in the Rosaceae family.
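
The four AS types counted in the abstract (IR, AA, AD, ES) can be illustrated with a small sketch. This is not the pipeline used in the dissertation; it is a simplified illustration that assumes plus-strand transcripts given as sorted lists of 1-based, inclusive exon coordinates, and it compares one alternative isoform against one reference isoform of the same gene. The function names `introns` and `classify_as` and the example coordinates are hypothetical. Here AA means an alternative acceptor (3') splice site, AD an alternative donor (5') splice site, and ES exon skipping.

```python
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive, plus strand assumed


def introns(exons: List[Interval]) -> List[Interval]:
    """Return the introns implied by a sorted list of exons."""
    return [(left_end + 1, right_start - 1)
            for (_, left_end), (right_start, _) in zip(exons, exons[1:])]


def classify_as(ref: List[Interval], alt: List[Interval]) -> Set[str]:
    """Label how isoform `alt` differs from isoform `ref` (simplified)."""
    ref_introns = introns(ref)
    ref_set = set(ref_introns)
    events: Set[str] = set()

    # IR: a reference intron lies entirely inside a single exon of `alt`.
    for i_start, i_end in ref_introns:
        if any(e_start <= i_start and i_end <= e_end for e_start, e_end in alt):
            events.add("IR")

    for a_start, a_end in introns(alt):
        if (a_start, a_end) in ref_set:
            continue  # intron shared with the reference, nothing to report
        # On the plus strand the donor is the intron start, the acceptor its end.
        same_donor = [i for i, (s, _) in enumerate(ref_introns) if s == a_start]
        same_acceptor = [i for i, (_, e) in enumerate(ref_introns) if e == a_end]
        if same_donor and same_acceptor and same_donor[0] < same_acceptor[0]:
            events.add("ES")  # one alt intron spans two or more reference introns
        elif same_donor:
            events.add("AA")  # same donor, different acceptor
        elif same_acceptor:
            events.add("AD")  # same acceptor, different donor
    return events


if __name__ == "__main__":
    reference = [(1, 100), (201, 300), (401, 500)]
    retained = [(1, 300), (401, 500)]
    print(sorted(classify_as(reference, retained)))
```

A real analysis would also need to handle minus-strand genes (where donor and acceptor positions swap), mutually exclusive exons, and alternative first or last exons, none of which this sketch covers.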
Keywords/Search Tags: woodland strawberry, alternative splicing, SMRT sequencing, fruit development, genome reannotation