Circular RNAs (circRNAs) are a special class of molecules whose exons are joined by an unusual splicing event in which a downstream splice donor is spliced to an upstream splice acceptor site. Unlike linear RNA, a circRNA forms a covalently closed continuous loop, which has been shown to be more stable than the associated linear mRNAs. Because their ends are joined together, circRNAs lack the characteristic molecular 'tails' of linear transcripts, and they have therefore generally been overlooked in typical RNA sequencing. With the development of new bioinformatic approaches and deep sequencing techniques, a growing number of studies now focus on circular RNA species. Recent studies have revealed that circRNAs are involved in regulating gene expression post-transcriptionally, while many of their other functions remain unknown. Identifying circRNAs is therefore essential for a comprehensive understanding of their regulatory role in complex gene expression systems and of the molecular mechanisms of cell function.

Neuroblastoma (NB) is the most common extracranial solid cancer in childhood and the most common cancer in infancy. This thesis focuses on identifying circRNAs in several neuroblastoma cell lines. The main research work and results are summarized as follows.

First, after a comprehensive analysis and comparison of various high-throughput RNA-Seq alignment tools, we developed a systematic bioinformatic and statistical pipeline that efficiently identifies circRNAs in human cells. Identification was based on high-throughput RNA-Seq data from neuroblastoma cell lines including CHLA, COG, SK-N-BE and SMS. We built a database of all scrambled junctions between annotated exon boundaries.
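The scrambled-junction database described above can be illustrated with a minimal sketch. It assumes a single gene on the forward strand, 0-based half-open exon coordinates, and a fixed flank length on each side of the junction; the function name `scrambled_junctions`, the identifier format, and the toy sequence are hypothetical and are not taken from the thesis pipeline.

```python
def scrambled_junctions(gene_seq, exons, flank=4):
    """Enumerate back-splice (scrambled) junction sequences for one gene.

    gene_seq : genomic sequence of the gene (forward strand assumed)
    exons    : list of (start, end) tuples, 0-based, end exclusive
    flank    : number of bases kept on each side of the junction

    A scrambled junction joins the 3' end of a donor exon to the 5' start
    of an acceptor exon that lies at the same position or upstream of it.
    """
    exons = sorted(exons)
    junctions = {}
    for d, (d_start, d_end) in enumerate(exons):
        # Acceptor exons are the donor exon itself and every exon upstream.
        for a, (a_start, a_end) in enumerate(exons[: d + 1]):
            donor_flank = gene_seq[max(d_start, d_end - flank):d_end]
            acceptor_flank = gene_seq[a_start:min(a_end, a_start + flank)]
            junctions[f"exon{d + 1}_to_exon{a + 1}"] = donor_flank + acceptor_flank
    return junctions


# Toy example (hypothetical sequence): two exons separated by an intron.
toy_gene = "AAAACCCCGGGGTTTT"
toy_exons = [(0, 4), (8, 12)]
toy_db = scrambled_junctions(toy_gene, toy_exons, flank=4)
```

In a real setting the resulting sequences would be written to a FASTA file and indexed so that reads can be aligned to them alongside the genome; reverse-strand genes would additionally need reverse complementing, which this sketch omits.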
Paired-end reads were aligned both to the human genome (hg19, ENSEMBL) and to this custom database, in each case using Bowtie2 with default settings. However, experimental and bioinformatic noise can give rise to spurious evidence of circular transcripts, and the candidates might still contain reads from linear RNAs. A further step, False Discovery Rate (FDR) control, was therefore carried out in RStudio. This step was necessary to ensure that the probability of read pairs originating from linear RNAs was small enough to be neglected. After redundancy elimination, the complete set of circRNAs from 14 neuroblastoma samples was obtained.

Finally, a biological statistical analysis was performed on the identified circRNAs, mainly at the chromosome and gene levels. We found that drugs affect circRNA production at the chromosome level, whereas the opposite holds at the gene level. Splice site selection for circRNA isoforms is similar to that for linear RNA. In addition, Gene Ontology analysis was carried out to explore the functions of the genes in terms of biological process, molecular function and cellular component. Together, these statistical analyses constitute the study of circRNAs presented in this thesis.
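The FDR-control step mentioned above can be sketched as follows. The abstract does not state which procedure was used, and the original work was done in RStudio; this sketch assumes the Benjamini-Hochberg step-up procedure, written in Python purely for illustration. The function name, the significance level, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which candidates pass BH FDR control.

    Assumption: each candidate junction has a p-value for the null
    hypothesis that its supporting reads come from linear RNA or noise.
    """
    m = len(p_values)
    # Indices of candidates sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Every candidate ranked at or below the cutoff is retained.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            keep[idx] = True
    return keep


# Hypothetical p-values for four candidate junctions.
candidate_p = [0.01, 0.02, 0.03, 0.5]
passed = benjamini_hochberg(candidate_p, alpha=0.05)
```

The step-up rule keeps every candidate up to the largest rank that satisfies the threshold, even if some lower-ranked candidates individually miss their own threshold, which is what distinguishes it from a simple per-candidate cutoff.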