Bioinformatics Study Of Long Non-coding RNAs In Ovarian Cancer

Posted on: 2020-08-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhao
GTID: 1484306494969399
Subject: Precision instruments and machinery
Abstract/Summary:
Ovarian cancer is one of the most common reproductive cancers in women and has the highest mortality among them. Because symptoms are not apparent in the early stages, approximately 60% of ovarian cancers are found at an advanced stage, so early detection is key to survival. In addition, owing to its high recurrence and metastasis rates and its chemotherapy resistance, ovarian cancer, especially ovarian serous cystadenocarcinoma, carries the poorest prognosis among all gynecological malignancies. Developing new methods for early detection and clinical treatment therefore requires understanding the molecular mechanisms behind the occurrence, development, and drug resistance of ovarian cancer. Over the past decade, high-throughput sequencing has revealed a large number of dysregulated long noncoding RNAs (lncRNAs) in ovarian cancer, but the functions and molecular mechanisms of most of them remain elusive. Because of their spatiotemporally specific expression, tissue specificity, and diverse functions, lncRNAs have become a hotspot in ovarian cancer research. In recent years, with the accumulation of ovarian cancer transcriptomic data and the implementation of The Cancer Genome Atlas (TCGA) program, it has become possible to identify ovarian cancer-associated lncRNAs, construct lncRNA regulatory networks through systems biology and bioinformatics approaches, and further investigate the functions and molecular mechanisms of these lncRNAs in ovarian cancer. The major innovative contributions of this thesis are as follows:

1. Because of poor RNA-Seq assembly quality and the loss of start or stop codons, incomplete protein-coding transcripts are more likely to be misclassified as lncRNAs. We therefore proposed a novel lncRNA identification tool, lncScore. It outperforms other tools (such as CPAT and CNCI) in distinguishing lncRNAs from mRNAs, especially when classifying incomplete protein-coding transcripts, with a recognition accuracy of more than 95%. lncScore also supports multi-threading and has a short running time. Further, we performed total RNA-seq on an Illumina platform for 3 ovarian cancer patients for whom both tumor tissue and adjacent noncancerous tissue were available. From the transcripts assembled from these RNA-Seq data, lncScore identified 5821 novel lncRNA transcripts and 4611 novel lncRNA genes. In the end, 10 novel lncRNA transcripts and 174 novel lncRNA genes were found to be differentially expressed in ovarian cancer.

2. Existing computational approaches for identifying competing triplets (lncRNA-miRNA-mRNA) are all based on global correlation, so their performance is strongly affected by the sample set. In addition, they can only identify miRNA-centric triplet candidates. We therefore proposed a new competing triplet identification tool, LncMiM. It uses an improved sliding window method to screen three centric types of triplet candidates with local expression correlations, which both reduces the false positive rate and improves sensitivity. Based on 373 paired RNA-Seq and miRNA-seq samples of ovarian cancer in the TCGA database, we used the competing triplets identified by LncMiM to construct lncRNA regulatory networks and analyze their functions. The results showed that the regulatory network is closely related to the proliferation, division, and migration of ovarian cancer cells.

3. Internal ribosome entry site (IRES) elements in RNA commonly mediate cap-independent translation, and they have recently been found to play important roles in tumorigenesis and tumor progression. A comprehensive IRES database is therefore urgently needed. We manually collected all experimentally verified IRES elements from the literature and built an IRES database, IRESbase. It contains 1184 IRES elements in total, more than eight times the number in other databases (e.g. IRESite and IRESdb). Its annotation is also richer, especially regarding the genomic locations of human IRES elements. Based on the RNA-Seq data of ovarian cancer in the TCGA database, we further calculated the expression correlation between lncRNAs and the IRES host mRNAs in IRESbase. In the end, 110 lncRNAs were found to interact with 159 IRES-containing mRNAs, and functional analysis showed that these lncRNAs may influence ovarian cancer cell proliferation by regulating the cell cycle and metabolic processes, and may influence ovarian cancer cell migration by regulating the Slit/Robo signaling pathway.

4. A large number of IRES elements in human RNAs remain undiscovered, and experimental identification is often time-consuming and labor-intensive. We therefore proposed a novel IRES identification tool, IRESfinder. The positive and negative samples in its training dataset were all derived from IRES verification experiments, and an improved k-mer feature, the framed k-mer, was used for the first time to classify IRES elements. Compared with existing tools, IRESfinder classifies eukaryotic IRES elements with higher accuracy and greater robustness. Based on the total RNA-seq data of ovarian cancer and adjacent tissues, 23 differentially expressed lncRNA transcripts were found using HISAT2, StringTie, and Ballgown. IRESfinder was then used to identify potential IRES elements in these 23 differentially expressed lncRNAs. Finally, based on the prediction results, 3 lncRNAs that may encode multiple small peptides were found, and functional analysis indicated that they are closely related to ovarian development.

In summary, using transcriptome sequencing data of ovarian cancer, this thesis used lncScore to identify novel lncRNAs in ovarian cancer, used LncMiM to construct the lncRNA-miRNA-mRNA regulatory network, and analyzed the potential functions of lncRNAs in ovarian cancer by studying the interactions between lncRNAs and the host mRNAs of IRES elements in IRESbase. Finally, IRESfinder was used to assist in identifying ovarian cancer-associated lncRNAs with potential small-peptide coding ability. These results will help clarify the molecular mechanisms linking the lncRNA regulatory network to the occurrence and development of ovarian cancer, and provide valuable clues for the early diagnosis of ovarian cancer and the development of targeted drugs.
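
The contrast between global and local expression correlation mentioned for competing triplets can be illustrated with a small sketch. This is not the LncMiM implementation, whose details the abstract does not give. It assumes one plausible reading of a sliding window screen: samples are ordered by the expression of the centric RNA, and the Pearson correlation between the other two RNAs is computed inside each window. The function names `pearson`, `windowed_correlations`, and `strongest_window`, and the `window` and `step` parameters, are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def windowed_correlations(center, rna_a, rna_b, window, step=1):
    """Correlate rna_a with rna_b inside windows of samples ordered by `center`.

    Assumption (not taken from the thesis): the centric RNA defines the sample
    ordering, and each window covers `window` consecutive samples in that order.
    Returns a list of (window_start, correlation) pairs.
    """
    if not (len(center) == len(rna_a) == len(rna_b)):
        raise ValueError("all expression vectors must have the same length")
    if window < 3 or window > len(center):
        raise ValueError("window must be between 3 and the number of samples")
    # Sort sample indices by expression of the centric RNA.
    order = sorted(range(len(center)), key=lambda i: center[i])
    a = [rna_a[i] for i in order]
    b = [rna_b[i] for i in order]
    return [
        (start, pearson(a[start:start + window], b[start:start + window]))
        for start in range(0, len(a) - window + 1, step)
    ]


def strongest_window(center, rna_a, rna_b, window, step=1):
    """Return the (window_start, correlation) pair with the largest |correlation|."""
    results = windowed_correlations(center, rna_a, rna_b, window, step)
    return max(results, key=lambda item: abs(item[1]))
```

On a toy series in which two RNAs rise together in the low-expression half of the samples and move in opposite directions in the high-expression half, the global correlation can be close to zero while individual windows still show a strong relationship. That is the kind of signal a purely global screen would miss.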
Keywords/Search Tags: Ovarian cancer, lncRNA, lncScore, IRESbase, IRESfinder, Competing triplet