Identification and Functional Annotation of Cancer-Associated Long Non-coding RNAs Based on CRISPR/Cas9 High-Throughput Screening Data

Posted on: 2019-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tao
Full Text: PDF
GTID: 2370330590976184
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: With the development of gene sequencing technology, a growing number of long non-coding RNAs (lncRNAs) have been identified in the human genome. Many studies have shown that lncRNAs play important regulatory roles in many cellular processes in human cells. In particular, many abnormally expressed lncRNAs are associated with the development and progression of cancer. However, the functions of most lncRNAs in cancer remain unknown, so the functional annotation of cancer-associated lncRNAs has become an active research topic in the life sciences. Owing to abundant high-throughput sequencing data, mature algorithms and efficient computational tools, bioinformatics methods have become one of the most important means of studying gene function, including the identification and functional annotation of lncRNAs. In this study, we identified essential protein-coding genes from genome-wide CRISPR screening data in a variety of cancer cell lines and used statistical models to predict cancer-associated lncRNAs. We then annotated the functions of these lncRNAs by constructing a regulatory network.

Methods: We identified 1231 essential protein-coding genes (PCGs) from genome-wide CRISPR screening data covering 69 samples of various cancer cell lines. We then built a PCG-lncRNA relationship network that combined three kinds of relationships: co-expression between PCGs and lncRNAs, co-expression between PCGs, and protein-protein interactions (PPI). On this network we predicted cancer-associated lncRNAs using hypergeometric enrichment analysis and the random walk with restart (RWR) algorithm. Several bioinformatics tools were then used to predict the regulatory biomolecules of these lncRNAs, including transcription factors, miRNAs, RNA-binding proteins and ceRNA-associated mRNAs, and the functions of the identified and predicted lncRNAs were annotated by enrichment analysis of their regulated proteins. The random walk implementation came from the DRaWR package; data processing and analysis were performed in Perl and R.

Results: The area under the curve (AUC) values for lncRNA prediction based on hypergeometric enrichment analysis and the RWR algorithm were 0.795 and 0.797, respectively. We identified 279 lncRNAs and predicted 339 lncRNAs associated with cancer, and then predicted their regulatory biomolecules: 453 transcription factors, 1653 miRNAs, 595 RNA-binding proteins and 1067 ceRNA-associated mRNAs. These proteins were mainly enriched in GO terms and KEGG pathways associated with cell metabolism, cell proliferation, the cell cycle, and diseases such as breast cancer and non-small cell lung cancer.

Conclusion: We identified cancer-associated lncRNAs with high reliability. They may play key roles in various cancers by participating in the regulation of biological processes such as cell growth, cell differentiation and cell proliferation.
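The two prediction steps named in the Methods can be illustrated with a minimal sketch. This is not the thesis code, which used the DRaWR package with Perl and R. It is a pure-Python toy under stated assumptions: the enrichment test is assumed to ask whether an lncRNA's network neighbours contain more essential PCGs than expected by chance, and the RWR is assumed to restart at the essential PCGs with a restart probability of 0.5. The function names, the toy network and all parameter values are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, successes, draws, observed):
    """Upper-tail hypergeometric p-value P(X >= observed).

    Assumed mapping (not stated in the abstract): population = all PCGs in
    the network, successes = essential PCGs, draws = PCG neighbours of one
    lncRNA, observed = how many of those neighbours are essential.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(observed, 0), upper + 1)
    )
    return tail / total


def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency: dict node -> dict neighbour -> edge weight (symmetric, and
    every neighbour must also be a key). seeds: nodes the walker restarts at.
    Nodes without edges pass no probability on, so total mass stays at 1
    only when every node has at least one edge.
    """
    nodes = list(adjacency)
    out_weight = {u: sum(adjacency[u].values()) for u in nodes}
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                continue
            share = (1.0 - restart) * p[u] / out_weight[u]
            for v, w in adjacency[u].items():
                nxt[v] += share * w
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: G1 and G2 are essential PCGs (seeds), G3 is a
# non-essential PCG, lncA and lncB are candidate lncRNAs.
toy_network = {
    "G1": {"lncA": 1.0},
    "G2": {"lncA": 1.0, "G3": 1.0},
    "G3": {"G2": 1.0, "lncB": 1.0},
    "lncA": {"G1": 1.0, "G2": 1.0},
    "lncB": {"G3": 1.0},
}
scores = random_walk_with_restart(toy_network, {"G1", "G2"})
ranked_lncrnas = sorted(["lncA", "lncB"], key=scores.get, reverse=True)
```

In this toy network lncA is directly linked to both seed genes and therefore ranks above lncB, which is only reachable through a non-essential gene. On a real network the ranked scores would then be compared against known cancer-associated lncRNAs to obtain an AUC of the kind reported in the Results.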
Keywords/Search Tags:Long non-coding RNA, CRISPR screening, Cancer, Functional annotation