
Non-Coding RNA Research Based on High-Throughput Sequencing Technology

Posted on: 2016-10-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Q Wang
Full Text: PDF
GTID: 1220330503976399
Subject: Biomedical engineering
Abstract/Summary:
A non-coding RNA (ncRNA) is not translated into protein, but it can perform regulatory functions alone or in combination with other functional molecules. With the development of high-throughput sequencing (HTS) technology, increasing amounts of sequencing data have been produced to reveal the biological functions of these molecules. However, although many studies have focused on HTS-based ncRNA analysis, the related bioinformatics methods are still limited. In this study, model building, pipeline construction, and method development were carried out as a systematic approach to ncRNA analysis based on HTS data, and the possible origin of ncRNA was also discussed. Our findings include:
(1) A small RNA sequencing analysis pipeline with optimized data visualization was developed for color-space-based SOLiD sequencing data. It performs miRNA differential expression analysis between samples and detects novel miRNAs. This pipeline has been used to find candidate biomarker miRNA genes in preeclampsia.
(2) Many miRNA isoform products (isomiRs) have been found in HTS data, and they may carry out important biological functions. A Shannon entropy-based model was used to estimate isomiR expression profiles from high-throughput small RNA sequencing data. The targets of highly variable isomiRs were found to be enriched in genes with cancer-related functions, suggesting that these molecules are not produced randomly. Applying this model to HTS data from Alzheimer's disease (AD) showed that 47 miRNAs had a significant change in isoform level between the early and late stages of the disease, of which 17 were known AD-related genes (P<1.59e-07). It was also found that the change in 5' isoform level appeared to be a more stable indicator than expression level for detecting AD-related miRNAs.
(3) To investigate the functional role of unplaced high-quality reads in HTS data, a pipeline was built to analyze such unplaced sequences in transcriptome data. After removal of contamination and low-complexity sequences, 214 candidate disease-associated sequences longer than 200 bp were assembled. In addition, a novel alignment-free method was used to rapidly distinguish coding from noncoding transcripts among these sequences.
(4) The hypervariable regions (HVRs) of the 16S rRNA gene are commonly used for microbial community reconstruction via HTS. However, most microorganisms cannot be identified at the genus level because of the short read length. Here, a genus-specific fragment-based method was implemented to detect more microorganisms at the genus level from HTS data. Each genus has its own preferential genus-specific amplicons for genus assignment. Using multiple regions, rather than one "universal" region based on aligned 16S rRNA sequences, could significantly improve microbial community reconstruction and identify more microorganisms at the genus level.
(5) The possible origin of ncRNA was also discussed. In Archaea, 56 pre-miRNA homologous genes were detected by pairwise sequence alignment, and 2649 archaeal miRNA seeds were predicted with a binomial distribution model. The statistically significant overlap between the detected archaeal seeds and known eukaryotic seeds suggests that miRNAs may have evolved before the divergence of these two domains of cellular life. For the first time, an increase in functional non-coding RNA was found to be associated with an increase in compositional symmetry, based on intra-strand partial symmetry analysis of animal whole-genome sequences. This explains, to some extent, why the number of functional ncRNAs increases with organism complexity.
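The abstract does not give the form of the Shannon entropy-based model in finding (2). As a minimal sketch of the general idea only, and not the dissertation's actual model, the entropy of one miRNA's isoform distribution can be computed from the fraction of reads assigned to each isomiR. The function name `isomir_entropy` and the isoform labels and counts are hypothetical and chosen purely for illustration.

```python
import math


def isomir_entropy(read_counts):
    """Shannon entropy (in bits) of one miRNA's isomiR read-count distribution.

    read_counts maps a (hypothetical) isoform label to its read count.
    This is an illustrative assumption, not the model from the dissertation.
    """
    total = sum(read_counts.values())
    if total == 0:
        return 0.0  # no reads: treat the distribution as carrying no information
    entropy = 0.0
    for count in read_counts.values():
        if count > 0:
            p = count / total  # fraction of reads assigned to this isoform
            entropy -= p * math.log2(p)
    return entropy


# Made-up counts: one dominant isoform versus reads spread over four isoforms.
dominant = {"canonical": 100}
spread = {"canonical": 25, "5p_plus1": 25, "3p_minus1": 25, "3p_plus1": 25}
print(isomir_entropy(dominant), isomir_entropy(spread))
```

Under this sketch, a miRNA whose reads all come from one isoform has entropy 0, and the value rises as reads spread more evenly across isoforms. A shift in entropy between two conditions is one way a change in isoform level could be summarized.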
Keywords/Search Tags: miRNA, bioinformatics, functional non-coding RNA, 16S rRNA, high-throughput sequencing, isomiR, lncRNA, RNA-seq, quality control