Identification Of Programmed Ribosomal Frameshifting Genes And Exploration Of The Frameshift Mechanism In Euplotes octocarinatus

Posted on: 2018-01-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R L Wan
GTID: 1310330521450093
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Protein biosynthesis is an essential life process for all living organisms. The ribosome must strictly maintain the translational reading frame in order to produce a functional protein. If the reading frame shifts, a truncated or nonsense protein is produced, which increases the energetic cost of translation and places additional load on the cellular cleanup and quality-control machineries. In special cases, however, the translating ribosome can switch from the initial (0) reading frame to the -1 or +1 reading frame at a specific position and then continue translation. This process is called programmed ribosomal frameshifting (PRF). Although the first PRF gene was identified in a virus, PRF has been shown to exist in all branches of life, from bacteria to higher eukaryotes. Previous studies of several sequenced genes of Euplotes ciliates have suggested that +1 PRF may be more common in these organisms. To understand this unusual case of frameshifting and the molecular mechanisms involved, we conducted a genome-wide investigation of PRF in E. octocarinatus through genome and transcriptome sequencing. We also carried out proteomic analysis to verify the predicted PRF genes. The following results were obtained:

1. The macronuclear genome of E. octocarinatus was sequenced on the Illumina platform. In total, about 11 Gb of data were obtained. After low-quality reads were filtered out, all remaining reads were used for genome assembly by a meta-assembly strategy. We then excluded contaminating bacterial contigs and mitochondrial contigs from the initial assembly by a series of filters. Finally, a total of 41,980 contigs with an average length of 2,117 bp were used as the E. octocarinatus macronuclear genome assembly, and most of them (70.1%) were capped with telomeres on both ends. The completeness of the genome was assessed by three metrics. The assessment indicated that the E. octocarinatus macronuclear genome encodes all the genes necessary for vegetative growth. The de novo prediction software AUGUSTUS was used to predict complete genes on the non-PRF contigs. Overall, 29,076 putative protein-coding genes were obtained, and 90% of them were supported by RNA-Seq reads. About 83% of nanochromosomes were predicted to contain only one gene. As in other ciliates, the noncoding regions of Euplotes are more AT-rich than the coding regions.

2. To construct the transcript set, high-throughput RNA-seq (2 × 125 bp) of E. octocarinatus at the growth stage was performed. We obtained 39,478,354 short reads, with a total length of more than 4.9 Gb. Two assemblers were tested and compared in order to obtain more full-length transcripts, and we adopted the assembly produced by TopHat and Cufflinks. In total, 32,353 transcripts were generated, with a mean transcript length of 1,300 bp. Based on the transcriptome data, a similarity search-based method was used to identify the PRF genes in E. octocarinatus. Overall, we identified 4,690 putative frameshift sites spanning 3,866 transcripts. Of these, 3,700 were +1 PRF genes and 166 were -1/+2 PRF genes. Multiple sequence alignments were subsequently performed to distinguish -1 PRF from +2 PRF, and only five -1 PRF genes were identified. We systematically investigated the hypothetical functions of the PRF genes. The Pfam protein family database was used to identify functional domains. All putative PRF genes were mapped to KEGG pathways to investigate the biological pathways in which they may be involved. The PRF genes were also annotated with GO terms for additional functional interpretation. These functional annotations indicated that the putative PRF genes in E. octocarinatus have various functions and are involved in multiple cellular processes and pathways. A GO enrichment analysis showed that the identified PRF genes were significantly overrepresented in the regulation of various biological processes, such as dephosphorylation, protein amino acid phosphorylation, and ubiquitin-dependent protein catabolic process.

3. Total proteins of E. octocarinatus were subjected to large-scale MS-based analysis through shotgun LC-MS/MS. A total of 2,853 proteins were detected, among which 253 were translated via PRF. Eight frameshift sites in seven +1 PRF proteins were covered by one or two unique peptides. One of the seven proteins, CUFF.27536.1, provided solid evidence that a single protein was produced by two +1 frameshifting events. The amino acid sequences of these peptides suggested that the frameshift occurred at the “T” of the slippery stop codon TAR. In addition, for 89 PRF proteins, both the region upstream and the region downstream of the frameshift site were covered by peptides, which provided indirect protein evidence for the presence of PRF in Euplotes. Although we could not obtain peptides spanning the frameshift site of +2 PRF proteins, the frameshift site could be deduced from further sequence analysis. We suggest that the frameshift occurred at the “TA” of the slippery stop codon TAR.

4. To search for potential conserved sequence elements, the 30 bp upstream and downstream of the conserved slippery sequence motif were extracted and analysed. Consistent with a previous report, no conserved sequence element was found except the slippery site sequence. Analysis of the slippery sequences indicated that approximately 92% of them consisted of an AAA codon followed by a stop codon, and the rest contained other codons preceding the stop. Altogether, we observed 47 out of 62 possible sense codons at the frameshift sites. We found that XXX codons were enriched at the +1 frameshift sites (97%) and that XTA codons were enriched at the +2 frameshift sites (73%). Among the five -1 frameshifting sites, three were AAG TAA; the remaining two were TCT TAA and TTT TAA. No stimulatory signals, such as RNA pseudoknots or stem-loops, were found downstream of the -1 frameshifting sites. In addition, we analysed the frequency of the stop codons and of the tetranucleotide sequences in E. octocarinatus, and compared their usage between the ‘normal’ stop codon and the slippery stop codon. The results showed that UAA and UAA-A were preferentially used in both the ‘normal’ termination signal and the slippery signal. Moreover, the frequencies of the UAA codon and the UAA-A tetranucleotide sequence in the slippery signal were significantly higher than those in the ‘normal’ termination signal, which suggests that they may be favourable for frameshifting in E. octocarinatus. Twelve novel tRNAs with an expanded anticodon loop were predicted from the genomic sequences of E. octocarinatus. Further analysis indicated that these tRNAs contain the characteristic internal split promoter, a TATA-box and a typical termination signal. Small RNA sequencing data indicated that all novel tRNAs, except Contig34792, are transcriptionally active in E. octocarinatus cells.

5. Based on nucleotide sequence analysis, we found 23 transcripts that contained at least one in-frame stop codon (UAG or UAA) in their predicted coding sequences. Sequence analysis and multiple sequence alignment of the Cathepsin B gene indicated that UAA and UAG can either terminate translation or code for glutamine in E. octocarinatus. The MS analysis yielded three peptides of the AMP-binding enzyme family protein, located both upstream and downstream of the in-frame stop codon. This suggests that these genes with in-frame stop codons are functional, and not simply pseudogenes with in-frame stops.

6. We constructed the E. octocarinatus Genome Database (EOGD, http://ciliates.ihb.ac.cn/database/species/eo). EOGD includes macronuclear genomic and transcriptomic data, predicted gene models, functional annotations, and taxonomy and morphology information for E. octocarinatus. A series of convenient search functions, including search by gene ID, scaffold ID, keywords, gene sequence and position, are provided for users to access the genomic resources. The establishment of EOGD should facilitate research on Euplotes and other ciliates.
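
The slippery-site analysis in result 4 and the +1 shift at the “T” of the stop codon in result 3 can be illustrated with a minimal Python sketch. It is not the pipeline used in the dissertation: the function names `find_slippery_sites` and `plus_one_codons`, the example sequence, and the choice to treat only TAA and TAG as stop codons (TGA being read as a sense codon in Euplotes) are assumptions made here for illustration.

```python
from collections import Counter

# Assumption for this sketch: only TAA and TAG act as stop codons
# (TGA is treated as a sense codon).
STOP_CODONS = {"TAA", "TAG"}


def find_slippery_sites(seq, frame=0):
    """List in-frame stop codons that have a preceding codon.

    Returns tuples of (stop position, preceding codon, stop codon,
    tetranucleotide = stop codon plus the following base, if any).
    """
    seq = seq.upper().replace("U", "T")
    sites = []
    for pos in range(frame + 3, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            sites.append((pos, seq[pos - 3:pos], codon, seq[pos:pos + 4]))
    return sites


def plus_one_codons(seq, stop_pos, frame=0):
    """Codons read when the ribosome skips the first base of the stop codon.

    Codons upstream of stop_pos are read in the original frame; the base at
    stop_pos (the "T" of TAR) is skipped and reading resumes at stop_pos + 1.
    """
    seq = seq.upper().replace("U", "T")
    upstream = [seq[i:i + 3] for i in range(frame, stop_pos, 3)]
    shifted = seq[stop_pos + 1:]
    downstream = [shifted[i:i + 3] for i in range(0, len(shifted) - 2, 3)]
    return upstream + downstream


if __name__ == "__main__":
    example = "ATGAAATAAGCTGGT"  # hypothetical sequence with an AAA TAA site
    sites = find_slippery_sites(example)
    print(sites)
    # Tally of codons preceding each in-frame stop, as in the slippery-site survey
    print(Counter(prev for _, prev, _, _ in sites))
    print(plus_one_codons(example, sites[0][0]))
```

Applied to a set of transcripts, the tally of preceding codons and tetranucleotides is the kind of summary behind statements such as "approximately 92% of slippery sequences consisted of an AAA codon followed by a stop codon"; telling true frameshift sites apart from ordinary termination still requires the similarity-search and proteomic evidence described above.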
Keywords/Search Tags:Euplotes octocarinatus, programmed ribosomal frameshifting, macronuclear genome, transcriptome, mass spectrometry, stop codon reassignment, Euplotes octocarinatus Genome Database