Study On The Identification And Calculation Method Of Viral CircRNA

Posted on: 2022-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Fan
GTID: 2480306731490944
Subject: Biology
Abstract/Summary:
Viruses are small non-cellular organisms composed of nucleic acids and/or proteins, and they rely on host cells for survival. Circular RNAs (circRNAs) are non-coding RNA molecules with a closed circular structure; they are rich in miRNA binding sites and play an important regulatory role in the progression of diseases. With the rapid development of high-throughput sequencing technology, the expression and function of circRNAs in different species has become a research hotspot. Several cancer-associated viral circRNAs have been identified in double-stranded DNA (dsDNA) viruses, but there has been no systematic study of the circRNAs encoded by viruses. In this thesis, bioinformatics methods were used to systematically identify viral circRNAs and to explore the factors affecting their identification. The main findings are as follows:

(1) Systematic identification of circRNAs encoded by viruses. RNA-seq datasets related to viral infection, either rRNA-depleted or RNase R-treated, were collected manually from public databases and analyzed with bioinformatics methods for circRNA identification. A total of 11,924 circRNAs encoded by 23 viruses from 15 viral families were obtained. Besides the dsDNA viruses, many circRNAs were identified in single-stranded RNA (ssRNA) viruses and retro-transcribing viruses, such as Zika virus, Influenza A virus, Zaire ebolavirus, and Human immunodeficiency virus 1. Analysis of the viral circRNAs showed that the number of circRNAs varied widely among viruses; most viral circRNAs had low abundance; 70% of viral circRNAs were between 200 bp and 1 kbp in length, with no significant difference in length between circRNAs encoded by ssRNA and dsDNA viruses; and the number of circRNAs did not differ significantly between viruses with linear and circular genomes. To facilitate use of the viral circRNAs identified above, the first viral circRNA database, VirusCircBase, was built and is publicly available at http://www.computationalbiology.cn/ViruscircBase/home.html. The database supports access, browsing, and download of viral circRNAs, as well as of the interactions between viral circRNAs and host miRNAs.

(2) Systematic analysis of the internal structures and expression heterogeneity of viral circRNAs. Alternative splicing events were found in viral circRNAs, and the frequencies of the different kinds of alternative splicing varied widely; viral circRNAs showed no location preference, as most were randomly located in the viral genomes; most viral circRNAs were expressed only in specific cells or tissues; and most circRNAs were expressed in the middle and late stages of viral infection.

(3) Investigation of the factors influencing viral circRNA identification. More viral circRNAs were identified from RNase R-treated RNA-seq datasets than from rRNA-depleted datasets; whether or not host sequences were removed had little influence on the identification of viral circRNAs; and the viral circRNAs identified by CIRI2 were also detected by two other methods, suggesting the robustness of the method.

In conclusion, this work is the first to systematically identify the circRNAs encoded by viruses and to analyze the factors influencing their identification. It will help to further clarify the role of circRNAs in the viral life cycle and provides a reference for the identification of viral circRNAs.
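The cross-method comparison in finding (3) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each tool's calls are reduced to back-splice junction coordinates and that agreement is measured as the fraction of CIRI2 calls also reported by every other tool. The `Junction` type, the `support_fraction` helper, the placeholder tool names, and all coordinates are hypothetical and are not taken from the thesis.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    """A back-splice junction call (hypothetical, simplified representation)."""
    genome: str  # viral genome identifier (toy values below)
    start: int   # junction start coordinate
    end: int     # junction end coordinate
    strand: str  # "+" or "-"


def support_fraction(reference: set[Junction], others: list[set[Junction]]) -> float:
    """Return the fraction of reference calls present in every other call set."""
    if not reference:
        return 0.0
    shared = set(reference)
    for calls in others:
        shared &= calls  # keep only junctions this tool also reports
    return len(shared) / len(reference)


# Toy call sets; the values are invented for illustration only.
ciri2_calls = {
    Junction("virus_A", 100, 550, "+"),
    Junction("virus_A", 900, 1400, "-"),
    Junction("virus_B", 20, 320, "+"),
}
tool_b_calls = {
    Junction("virus_A", 100, 550, "+"),
    Junction("virus_A", 900, 1400, "-"),
    Junction("virus_B", 20, 320, "+"),
    Junction("virus_B", 700, 1200, "+"),
}
tool_c_calls = {
    Junction("virus_A", 100, 550, "+"),
    Junction("virus_B", 20, 320, "+"),
}

fraction = support_fraction(ciri2_calls, [tool_b_calls, tool_c_calls])
print(f"CIRI2 calls supported by all other tools: {fraction:.2f}")
```

Exact coordinate matching is the strictest possible criterion; a real comparison might tolerate small offsets at junction boundaries, which this sketch does not attempt.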
Keywords/Search Tags: bioinformatics, virus, circRNA, expression heterogeneity, identification strategy