
Bioinformatics Tool Development And Database Construction For Translatable Circular RNAs

Posted on: 2022-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: P S Sun
GTID: 2510306341974499
Subject: Bioinformatics
Abstract/Summary:
Circular RNA (circRNA), which plays an important role in many biological regulatory pathways, is widespread across species. It is an RNA molecule with a closed circular structure that has no 5' or 3' end and is therefore resistant to degradation by RNase. It can perform biological functions by interacting with microRNAs or small proteins, for example by acting as a microRNA sponge. Some circRNA sequences are highly conserved. A growing number of circRNAs have been found to be translatable: they produce proteins in cells and play important roles in biological processes such as growth, development, and immune responses. Correctly identifying circRNAs with translational potential is therefore of great significance for further research, yet software for this purpose is currently lacking. With the development of high-throughput sequencing and the emergence of ribosome profiling, it has become possible to identify the coding ability of circRNAs with high sensitivity.

Here, we developed CircCode, software that identifies the translation potential of circRNAs from ribosome profiling data. CircCode is a Python 3 pipeline and a simple but powerful command-line tool: the user only needs to fill in the provided configuration file and run the Python script to obtain the predicted translated circRNAs. It identifies circRNAs with translation capability from a given set of candidate circRNAs with high accuracy. To test its performance, we collected circRNA datasets from Arabidopsis and human together with the corresponding ribosome profiling data. Using CircCode, we found 4651 translated circRNAs in humans and 371 in plants. The software has been released on GitHub (https://github.com/PSSUN/CircCode) for free use by researchers.

In addition, for downstream analysis and visualization of circRNA data, this research developed Rcirc, a third-party package for the R language. Building on CircCode, Rcirc not only predicts the translation ability of circRNAs but also provides functions such as circRNA identification and feature analysis (covering both individual and collective features). For circRNAs with translation potential, Rcirc offers a dedicated visualization of read alignment at the junction site. It shows all reads mapped near the junction, including the distribution of reads, the base type at each position of the covered region, and highlighted start and stop codons. Optional parameters allow the displayed region to be zoomed in and out. Detailed usage is documented in Rcirc's online manual (rcirc-doc.readthedocs.io/en/latest/), and the R package is also hosted on GitHub (https://github.com/PSSUN/Rcirc).

Finally, based on the translatable circRNAs identified by CircCode, we established TransCircDB (transcircdb.com), a database of translated circRNAs. It stores sequence and location information for translated circRNAs from more than ten species, including human, mouse, rat, chicken, Arabidopsis, maize, and rice, all available for free download and use by researchers. TransCircDB also supports a variety of online bioinformatics analyses, including circRNA identification, identification of translation capability, and analysis of the circularization mechanism. Further online tools, such as sorting of sequencing data and visualization of sequencing read alignments, let users submit data directly on the web page to complete routine analyses. All of the above services are provided free of charge.
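To make the idea of junction-site read evidence concrete, the following is a minimal Python sketch. It is not CircCode's or Rcirc's actual implementation. It assumes a circRNA is given as a linear string whose last base is joined to its first base, and it uses exact string matching in place of a real read aligner. The function names (`junction_window`, `spans_junction`, `find_spanning_reads`) and the `flank` and `min_overhang` parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def junction_window(circ_seq: str, flank: int) -> str:
    """Return the sequence across the back-splice junction.

    The window is the last `flank` bases of the linearized circRNA
    followed by its first `flank` bases, so the junction lies between
    window positions flank-1 and flank.
    """
    if flank <= 0 or flank > len(circ_seq):
        raise ValueError("flank must be between 1 and len(circ_seq)")
    return circ_seq[-flank:] + circ_seq[:flank]


def spans_junction(start: int, read_len: int, flank: int,
                   min_overhang: int = 3) -> bool:
    """True if a read covers at least `min_overhang` bases on each side."""
    end = start + read_len  # exclusive end position in the window
    return start <= flank - min_overhang and end >= flank + min_overhang


def find_spanning_reads(window: str, reads: List[str], flank: int,
                        min_overhang: int = 3) -> List[Tuple[str, int]]:
    """Collect (read, position) pairs for reads that cross the junction.

    Assumption: exact matching only; a real pipeline would use an
    aligner that tolerates mismatches.
    """
    hits = []
    for read in reads:
        pos = window.find(read)
        while pos != -1:
            if spans_junction(pos, len(read), flank, min_overhang):
                hits.append((read, pos))
            pos = window.find(read, pos + 1)
    return hits


# Toy example with made-up sequences (not real data).
circ = "ATGGCCTTAGGACCGTAA"
window = junction_window(circ, flank=6)
reads = ["GTAAATGG", "ATGGCC", "CCGTAA"]
spanning = find_spanning_reads(window, reads, flank=6)
print(window, spanning)
```

In this toy example only the first read crosses the junction; the other two lie entirely on one side and could equally have come from a linear transcript, which is why reads spanning the junction are the informative ones.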
Keywords/Search Tags: circular RNA, bioinformatics, ribosome profiling, machine learning, translation, data visualization, NGS, Python, R