
Genome-wide Identification Of Conserved Translating Small Open Reading Frames In Early Drosophila Embryos

Posted on: 2017-12-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H M Li
Full Text: PDF
GTID: 1360330590491088
Subject: Biology
Abstract/Summary:
Accurate annotation of the protein-coding sequences within the genome is essential to understanding how genetic information is ultimately translated into biological function. To date, however, the overwhelming majority of this annotation has focused on open reading frames (ORFs) longer than 100 amino acids (aa), owing to the considerable technical challenges of identifying ORFs smaller than 100 aa, the small ORFs (sORFs). Recently, several sORFs have nevertheless been shown to play fundamental roles, particularly in growth and development. There is thus growing recognition that this largely unknown portion of the translatome must be identified and validated in order to understand basic biological processes.

In this work, we provide the first comprehensive annotation of the translating sORF population present during early Drosophila embryogenesis. By combining ultra-deep translatome sequencing of ribosome-associated RNA with bioinformatic analysis of amino-acid conservation (PhyloCSF), our data essentially double the number of known sORFs in Drosophila, identifying 399 translating sORFs during the first 4 hours of embryogenesis, the critical period during which control shifts from maternally encoded to zygotically encoded transcripts. These include 128 sORFs that were previously annotated but lacked direct translational evidence, 22 sORFs within transcripts previously believed to be non-coding, and 45 sORFs not previously known to be transcribed. We find clear translational support for many sORFs with different isoforms, suggesting that their regulation is as complex as that of longer ORFs. Finally, we directly validate the translational capacity of randomly selected sORFs using an Enhanced Green Fluorescent Protein (eGFP)-tagged assay, which reveals widely different cellular distributions for the sORF-encoded peptides in S2R+ cells, likely attesting to their wide range of biological functions. We also found that 201 sORFs translated in the early embryo are not present in late-stage Drosophila S2 cells, suggesting that many of the translated sORFs have stage-specific functions during embryogenesis. In addition, according to their expression patterns across developmental stages, these translating sORFs were classified into seven clusters, each showing distinct gene ontology enrichment terms.

The innovations of this work are: (1) it provides the first comprehensive annotation of the translating sORF population present during early Drosophila embryogenesis, especially the newly identified sORFs located in lncRNAs and in newly assembled transcripts, a necessary resource for understanding their function in Drosophila embryogenesis and other biological processes; (2) it sets up a method to identify conserved translating sORFs, in which ribosome association serves as experimental evidence of translation and high amino-acid conservation (PhyloCSF analysis) indicates the potential function of the encoded peptides; (3) it establishes a strand-specific RNA-seq library construction method with simple operations, and designs eGFP-fused sORF constructs to validate the translational capability of sORFs in vivo.

Overall, this work provides the first comprehensive identification of translating sORFs in the 0-4 hr Drosophila embryo by combining deep sequencing of ribosome-associated RNA with amino-acid conservation analysis, offering a reliable resource for further understanding of early Drosophila embryogenesis and other biological processes.
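To make the sORF definition above concrete, the following is a minimal sketch of how candidate sORFs might be enumerated on a transcript sequence before any ribosome-association or PhyloCSF filtering. It is not the pipeline used in the dissertation: the function name `find_sorfs`, the ATG-only start rule, the forward-strand restriction, and the 10-aa lower bound are all assumptions made for illustration. Only the upper bound (smaller than 100 aa) comes from the abstract.

```python
# Illustrative sketch only. Assumptions (not from the abstract): ORFs start at
# ATG, end at a standard stop codon, are scanned on the forward strand only,
# and must encode at least 10 aa. The abstract states only "smaller than 100 aa".
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_aa=10, max_aa=99):
    """Return sorted (start, end, length_aa) tuples for candidate sORFs.

    start is the 0-based index of the ATG; end is exclusive and includes the
    stop codon; length_aa counts codons from ATG up to (not including) the stop.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                length_aa = (i - start) // 3
                if min_aa <= length_aa <= max_aa:
                    hits.append((start, i + 3, length_aa))
                start = None  # resume searching after this stop codon
    return sorted(hits)


if __name__ == "__main__":
    # Toy transcript: 2-nt leader, ATG + 11 alanine codons, stop, 2-nt trailer.
    toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
    print(find_sorfs(toy))
```

In a workflow like the one described, candidates produced this way would still need experimental translation evidence (ribosome-associated reads) and a conservation score before being called translating sORFs; both of those steps are outside this sketch.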
Keywords/Search Tags: small open reading frames, sORFs, PhyloCSF, translatome, early Drosophila embryo