
Genome-Wide Analysis Of RNA Trans-splicing

Posted on: 2011-05-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z X Zuo
GTID: 1220330332982983
Subject: Genetics
Abstract/Summary:
The primary transcripts of eukaryotic genes, called pre-mRNAs, contain both intronic and exonic sequences. Forming a mature mRNA from a pre-mRNA requires removing the intronic sequences and joining the exonic sequences. When intron removal occurs within a single pre-mRNA molecule, it is termed RNA cis-splicing. Splicing between two separate pre-mRNA molecules, termed RNA trans-splicing, can also occur, but at a much lower rate. Although the study and understanding of trans-splicing still lag far behind those of cis-splicing, trans-splicing has drawn wide attention because of its potential value in gene therapy. Most gene therapies target DNA, whereas trans-splicing-based gene therapies target RNA. RNA-targeted gene therapies are considered more effective and efficient than DNA-targeted ones, so trans-splicing-based gene therapies are becoming increasingly popular.

Trans-splicing is found mainly in lower eukaryotes such as kinetoplastids and nematodes. Some cases of trans-splicing have recently been reported in higher eukaryotes, including human and mouse, but trans-splicing is believed to be rare in higher eukaryotes and negligible compared with cis-splicing. Several studies have observed many chimeric sequences in public human sequence databases. These chimeric sequences have been attributed to chromosomal translocation rather than trans-splicing, because trans-splicing is rare in higher eukaryotes and translocation is frequent in some tumor tissues.

With the development of next-generation sequencing technology, abundant short (~32 bp) RNA-Seq reads are accumulating in public sequence databases. We used both long EST sequences and short RNA-Seq sequences to detect trans-splicing at the genome-wide level.
In this way, not only artifactual chimeras but also chromosomal translocation chimeras could be excluded, because both are generated by accidental ligation of different cDNAs or different DNAs, and the probability that the same event recurs in different individuals or experiments is very low. Trans-splicing, however, is a natural splicing mechanism and can therefore be supported by both EST and RNA-Seq sequences. Through cross-validation with sequences from different sources, highly reliable trans-splicing predictions could be obtained.

The finding of disrupted tRNAs in archaea suggested that trans-splicing occurs not only in mRNAs but also in other RNAs such as tRNAs. We analyzed possible disrupted tRNAs in all three domains of life (archaea, bacteria and eukaryotes) and discussed the evolutionary relationship between disrupted tRNA genes and continuous tRNA genes.

The major results of our research are listed below:

1. Genome-wide analysis of trans-splicing

EST/mRNA sequences from public sequence databases were used for the initial identification of chimeras. We obtained 53,074 and 19,151 candidate chimeras in human and mouse, respectively. After removing redundant chimeras, 45,942 in human and 15,315 in mouse remained. RNA-Seq sequences were then used to filter the trans-splicing chimeras. Finally, we obtained 1,235 and 1,129 chimeras that could be validated by RNA-Seq sequences in human and mouse, respectively. These trans-splicing chimeras show an obvious chromosomal distribution bias compared with the set before RNA-Seq filtering. For example, in human, chromosome 21 is rich in trans-splicing chimeras, whereas few are found on chromosomes 5, 10 and 14.

We performed an initial analysis and discussion of the possible mechanism of trans-splicing based on the pool of predicted trans-splicing sequences.
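The cross-validation idea described above, in which an EST-derived chimera is kept only when independent RNA-Seq reads span its junction, can be sketched as follows. This is a minimal illustration and not the pipeline used in the dissertation: the `Chimera` record, the exact-match criterion, and the `min_anchor` and `min_reads` thresholds are all assumptions made for the example. A real analysis would align reads with a dedicated short-read aligner and tolerate mismatches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chimera:
    """Hypothetical record for one EST-derived chimera candidate."""
    name: str
    upstream: str    # sequence of the 5' partner, ending at the junction
    downstream: str  # sequence of the 3' partner, starting at the junction


def spans_junction(read: str, chimera: Chimera, min_anchor: int = 8) -> bool:
    """Return True if `read` matches the joined chimera exactly and covers
    at least `min_anchor` bases on each side of the junction."""
    joined = chimera.upstream + chimera.downstream
    junction = len(chimera.upstream)  # index of the first downstream base
    start = joined.find(read)
    while start != -1:
        end = start + len(read)
        if junction - start >= min_anchor and end - junction >= min_anchor:
            return True
        # The read may occur more than once; check later occurrences too.
        start = joined.find(read, start + 1)
    return False


def filter_by_rnaseq(chimeras, reads, min_anchor: int = 8, min_reads: int = 1):
    """Keep chimeras whose junction is supported by at least `min_reads` reads."""
    kept = []
    for chimera in chimeras:
        support = sum(spans_junction(r, chimera, min_anchor) for r in reads)
        if support >= min_reads:
            kept.append(chimera)
    return kept


if __name__ == "__main__":
    c1 = Chimera("c1", "ACGTACGTACGGATCCGATA", "TTGACCAGTACCGTAGGCTA")
    c2 = Chimera("c2", "AAAAAAAAAACCCCCCCCCC", "GGGGGGGGGGTTTTTTTTTT")
    reads = ["GGATCCGATATTGACCAGTA"]  # covers 10 bases on each side of c1's junction
    print([c.name for c in filter_by_rnaseq([c1, c2], reads)])
```

Requiring a minimum anchor on both sides of the junction is a common way to avoid counting reads that merely end near the junction. The value of 8 bases used here is arbitrary.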
There are three ways in which the upstream and downstream transcripts of a sequence can be joined: overlapped junction, exact junction and gapped junction. Most of the trans-splicing events have overlapped junctions.

We found that 64% and 66% of trans-splicing events in human and mouse, respectively, could encode new proteins, which suggests that trans-splicing could contribute to protein diversity as cis-splicing does. We also performed a functional annotation of the trans-splicing events formed by known genes. The results suggest that ribosomal protein, nucleic acid binding and cytoskeletal protein genes are inclined to undergo trans-splicing, whereas trans-splicing is less common in receptor molecules in human.

We analyzed the relationship between chromosome interaction and trans-splicing. We observed more intrachromosomal than interchromosomal trans-splicing events. Intrachromosomal interactions are known to occur more readily than interchromosomal interactions, so we inferred that chromosome interaction could affect trans-splicing: trans-splicing tends to occur between genes whose chromosomes are close to each other in the cell. Hi-C data further confirmed this observation.

We selected a subset of trans-splicing RNAs from human and mouse for validation by RT-PCR. Results are:

Finally, the trans-splicing data were imported into a relational MySQL database, TRSDB, and a web interface for TRSDB was developed in PHP. All the trans-splicing data, including chromosome location, gene annotation and splicing information, can be browsed through the TRSDB web interface.

2. The evolution of tRNA: from disrupted genes to continuous genes

We first performed a deep analysis of the distribution of single-nucleotide substitutions around tRNA genes in different genomes, including 43 archaea, 42 bacteria and 8 eukaryotes.
We detected an enrichment of single-nucleotide substitutions in the regions near tRNA genes, both within and between species, which suggests that tRNA genes are insertion-derived sequences, as inferred from the "indel-associated substitutions" mutational mechanism. We then sought insertion evidence in the evolutionary history of vertebrates and insects. As expected, about 18 candidate insertions were found in the 27-way whole-genome alignments of vertebrates and about 5 candidate insertions in the 14-way whole-genome alignments of insects.

We then performed a BLAT search in different genomes, including human, chimpanzee, rhesus, marmoset, mouse and D. melanogaster, using tRNA sequences as queries, to determine whether tRNA pieces are present in these genomes. tRNA half homologs were observed universally in these genomes. We performed the same single-nucleotide substitution analysis on tRNA half homologs as on tRNA genes. However, single-nucleotide substitutions near tRNA half homologs were not obviously increased, unlike those near tRNA genes. It therefore seems more plausible that half-tRNA-like sequences are ancient. We found that tRNA half homologs are retrotransposon-related elements. Expression of tRNA half homologs could be detected using CAGE tags and RT-PCR.

Our results suggest that modern tRNA genes were first present throughout the genome as pieces; mature tRNA molecules, formed by ligating the tRNA pieces at the RNA level, could then insert into the genome and gradually become stable through subsequent evolution.
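As an illustration of the three junction types named in result 1 (overlapped, exact and gapped), the sketch below classifies a chimera from the coordinates of its two alignments on the EST query. The abstract does not define the categories formally, so the definitions here are an assumption: "overlapped" means the two alignments share query bases, "exact" means they abut, and "gapped" means unaligned query bases lie between them. The function name and the coordinate convention (1-based, inclusive) are likewise hypothetical.

```python
from collections import Counter


def classify_junction(upstream_end: int, downstream_start: int) -> str:
    """Classify a chimera junction from query coordinates (1-based, inclusive).

    upstream_end:     last query base aligned to the upstream transcript
    downstream_start: first query base aligned to the downstream transcript
    """
    if upstream_end < 1 or downstream_start < 1:
        raise ValueError("coordinates must be positive, 1-based positions")
    # Number of query bases between the two alignments; negative means overlap.
    unaligned = downstream_start - upstream_end - 1
    if unaligned < 0:
        return "overlapped"
    if unaligned == 0:
        return "exact"
    return "gapped"


if __name__ == "__main__":
    # Made-up coordinates for demonstration only.
    junctions = [(120, 115), (120, 121), (120, 130), (88, 80)]
    tally = Counter(classify_junction(u, d) for u, d in junctions)
    print(dict(tally))
```

Tallying the categories over all predicted events, as in the usage lines above, is one way to arrive at a statement such as "most events have overlapped junctions". The coordinates shown are invented and do not come from the dissertation.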
Keywords/Search Tags: Trans-splicing, EST, RNA-Seq, Genome, tRNA, tRNA half, Evolution