
Implementation And Integration Of CircRNA Analysis Pipeline Based On Nextflow

Posted on: 2021-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: C H Wei
Full Text: PDF
GTID: 2370330611967345
Subject: Computer technology
Abstract/Summary:
CircRNA (circular ribonucleic acid) is a special class of non-coding RNA that forms a small circular molecule, and it is a recent research focus in the RNA field. By interacting with disease-associated microRNAs, naturally generated circRNAs can affect gene expression. CircRNA plays an important regulatory role in many research areas, such as the occurrence and development of diseases, biological pathways, body growth, and cell resistance to the external environment. In recent years, to search for circRNAs precisely and comprehensively, researchers have developed several circRNA prediction methods based on counting RNA sequencing reads. However, the performance of these algorithms differs considerably. For example, MapSplice and CIRCexplorer2 cannot detect circRNAs de novo. Segemehl not only requires a long running time and a large amount of memory, but also has a high false-positive rate. Obtaining a complete and accurate circRNA analysis report is therefore an urgent technical problem. Integrating authoritative circRNA prediction software makes it possible to obtain a complete set of circRNA results.

Based on this, the main contents of this paper are as follows. First, this paper briefly describes the background, current state of research, and significance of circRNA detection. Second, it introduces the core algorithms of the different detection tools and compares their advantages and disadvantages. Third, it proposes an integrative circRNA analysis pipeline based on the Nextflow framework, using the strong coupling of process and channel modules. The pipeline automatically executes each process according to the data flow, which reduces tedious manual operations. Specific circRNA filters and recombination strategies are also designed into the pipeline. To give a comprehensive account of circRNAs, statistical bioinformatics analysis methods and a biological pathway enrichment analysis tailored to circRNA are added. Finally, the pipeline's performance is evaluated on real data sets, and its comprehensiveness and superiority are demonstrated by comparison with other methods.

In conclusion, the main achievements of this paper are as follows:
(1) This paper drew on the advantages of five circRNA detection tools and integrated them into a new circRNA analysis pipeline using the Nextflow framework. The pipeline offers strong coupling, one-click execution, and automatic parameter configuration, which reduce operational complexity and improve the efficiency of circRNA analysis.
(2) Unlike other stand-alone pipelines, this integrative circRNA pipeline includes not only specific circRNA filters and recombination strategies, but also bioinformatics analysis methods, which enable a comprehensive interpretation of circRNAs.
(3) In addition to theoretical explanations, this paper used real data sets to test the rationality and effectiveness of the pipeline. Biological verification showed that the pipeline benefits circRNA studies and that the circRNAs it predicts play an important role in cancer processes. For example, the predicted circRNAs influenced the pathway of tumor metastasis targets and cancer reclassification.
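The abstract does not spell out the filters or the recombination strategy, so the following is only a minimal sketch of one plausible approach: keep a circRNA candidate when at least a chosen number of tools report the same back-splice junction. The function name `merge_calls`, the `min_tools` threshold, the junction tuple layout, and the example coordinates are all assumptions made for illustration and are not taken from the thesis.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation of a back-splice junction:
# (chromosome, start, end, strand).
Junction = Tuple[str, int, int, str]


def merge_calls(
    calls_by_tool: Dict[str, Iterable[Junction]],
    min_tools: int = 2,
) -> Dict[Junction, List[str]]:
    """Keep junctions reported by at least `min_tools` distinct tools.

    Returns a mapping from each retained junction to the sorted list of
    tool names that reported it. The voting rule is an assumption, not
    the strategy described in the thesis.
    """
    support: Dict[Junction, Set[str]] = defaultdict(set)
    for tool, junctions in calls_by_tool.items():
        for junction in junctions:
            # A set ensures each tool is counted once per junction.
            support[junction].add(tool)
    return {
        junction: sorted(tools)
        for junction, tools in support.items()
        if len(tools) >= min_tools
    }


if __name__ == "__main__":
    # Made-up coordinates; the tool names are the three mentioned in the abstract.
    example = {
        "MapSplice": [("chr1", 1000, 2000, "+"), ("chr2", 500, 900, "-")],
        "CIRCexplorer2": [("chr1", 1000, 2000, "+")],
        "Segemehl": [("chr1", 1000, 2000, "+"), ("chr3", 10, 80, "+")],
    }
    for junction, tools in sorted(merge_calls(example).items()):
        print(junction, tools)
```

Raising `min_tools` trades sensitivity for fewer false positives, which is one way to offset a single tool with a high false-positive rate. Setting it to 1 returns the union of all calls.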
Keywords/Search Tags: circRNA, Nextflow, Pipeline Integration, Bioinformatics Analysis