
Bioinformatic Analysis Of CircRNA Regulatory Molecules In Human Transcriptome

Posted on: 2017-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: X P Chen
Full Text: PDF
GTID: 2310330503495848
Subject: Engineering
Abstract/Summary:
Circular RNAs (circRNAs) are a novel group of abundant, stable, and evolutionarily conserved noncoding RNAs. Unlike linear RNA, which has a 5' cap and a 3' poly(A) tail, a circRNA is an RNA molecule whose 3' and 5' ends are covalently linked to form a circle. Recent studies suggest that non-canonical splicing can scramble exons to form a circle. These RNA molecules have been found to be evolutionarily conserved, stable, and specifically expressed across tissues or developmental stages. In addition, they play different roles in gene regulation. These features have made circRNA a prominent topic in the RNA field.

First, in this thesis, human circRNA datasets from the literature and from the previous work of our group were sorted and annotated, and their genomic characteristics and gene enrichment were studied. Basic information (genomic location) was extracted from the collected circRNA datasets. A gene annotation GTF file was then used to retrieve the relevant genomic content and enrich the dataset. After removing false-positive or redundant entries, 7804 human genes with their 32914 exonic circRNAs were obtained. Next, statistical analysis was performed to explore the genomic features and cell type-specific expression of circRNAs. The results show that more circRNAs are derived from chromosome 1 or 2, and that circRNAs generally contain 2~3 backspliced exons. The process of transcribing circRNAs may favor a certain length that maximizes exon circularization. In addition, alternative circularization is common in the genome and occurs in a cell type-specific manner. Lastly, enrichment analysis was conducted for the RefSeq genes that produced more circRNAs.
The results indicate that, as gene transcription products, circRNAs are involved in many important biological pathways and are closely related to several human phenotypes and diseases.

Furthermore, to study the protein-coding potential of circRNAs, their internal ribosome entry sites (IRES) were identified based on the similarity of IRES RNA secondary structure, and the longest open reading frames (ORFs) were predicted. A total of 6608 circRNAs, approximately 20.08% of all circRNAs, contain both an IRES and an ORF and therefore have the potential to encode proteins.

Finally, a comprehensive reference database of human circRNAs, named circRNADb, was built. The circRNA dataset, together with its genomic features and protein-coding potential, was integrated into circRNADb. As a comprehensive database, circRNADb may provide a foundation for further study of circRNAs in the RNA field.
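
The abstract does not state how the open reading frames were predicted. As a rough illustration only, the following Python sketch shows one way an ORF search could be run on a circular sequence, where a reading frame may cross the backsplice junction. The function name `longest_circular_orf`, the `max_laps` limit, and the restriction to ATG start codons are assumptions made for this example, not details taken from the thesis.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_circular_orf(seq, max_laps=2):
    """Return (start, length_nt) of the longest ATG..stop ORF on a circle.

    start is a 0-based index into seq; length_nt includes the stop codon.
    Returns None when no ATG is followed by an in-frame stop codon within
    max_laps laps around the circle (max_laps is an illustrative cutoff).
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    n = len(seq)
    if n < 3:
        return None
    # Unroll the circle so codons spanning the junction can be read linearly.
    unrolled = seq * (max_laps + 1)
    best = None
    for start in range(n):
        if unrolled[start:start + 3] != "ATG":
            continue
        # Walk codon by codon, at most max_laps full laps from the start.
        for pos in range(start + 3, start + max_laps * n, 3):
            if unrolled[pos:pos + 3] in STOP_CODONS:
                length = pos + 3 - start
                if best is None or length > best[1]:
                    best = (start, length)
                break
    return best


if __name__ == "__main__":
    # The start codon here spans the junction: ...A, T at the end, G at index 0.
    print(longest_circular_orf("GAAATAGCCAT"))
```

Because a circle has no ends, a frame with no in-frame stop codon could in principle be read indefinitely; the sketch simply reports no ORF in that case rather than guessing at a biologically meaningful cutoff.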
Keywords/Search Tags: circRNAs, bioinformatics, protein-coding potential, internal ribosome entry site, open reading frame, database