
Development Of QuaPra And Its Application In Rat Transcriptome Reconstruction

Posted on: 2019-03-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X J Ji
GTID: 1360330596455518
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
RNA sequencing (RNA-seq) has greatly facilitated exploration of the transcriptome landscape of diverse organisms. However, transcriptome reconstruction remains challenging because of the limitations of current tools and sequencing technologies. Here, we introduce an efficient tool, QuaPra (Quadratic Programming combined with Apriori), for accurate transcriptome assembly and quantification. On simulated data, QuaPra detected at least 26.5% more low-abundance (0.1 to 1 FPKM) transcripts than other currently popular tools, with an increase of at least 2.7% in sensitivity and precision. Moreover, on real sequencing data, QuaPra correctly assembled around one-quarter more known transcripts than other assemblers.

The rat is an important model organism in biomedical research, but its annotated transcriptome is far from complete. This thesis constructed a rat transcriptome re-annotation, named RTR, using two transcript assemblers: StringTie and QuaPra. The RNA-Seq data, generated by the SEQC consortium, come from 320 samples across 11 organs. RTR is comparable to the well-annotated mouse transcriptome. It contains 61,582 genes and 130,308 transcripts. Transcripts and exons in RTR account for ~44% and ~10% of the genome, respectively. Among the newly identified splice junctions, ~35% are semi-conservative, resulting in intron retention and exon skipping, which are major mechanisms for generating new splice forms of potential biological significance. Of the 130,308 identified transcripts, 37,203 were annotated as high-confidence novel coding transcripts and 34,718 as high-confidence long noncoding transcripts. Of the 37,203 novel coding transcripts, 36,564 were annotated with functions. Novel transcripts overlap transposable elements far more often than known transcripts do. We also found that 10,356 genes and 12,905 transcripts were expressed in all 11 tissues, and that 748 housekeeping genes expressed different isoforms among tissues. This new rat transcriptome provides an essential reference for genetics and gene expression studies in rat disease and toxicity models.
Keywords/Search Tags: RNA-Seq, transcriptome reconstruction, transcript assembly, transcript quantification, rat, BodyMap, functional annotation