
Computational Study Of MicroRNA Based On Sequence Analysis

Posted on: 2013-03-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J D Ding
GTID: 1220330395951178
Subject: Computer software and theory
Abstract/Summary:
Small non-coding RNA (sncRNA) is a general term for a broad class of short non-coding RNAs (ncRNAs) that are 20-30 nt long and are not translated into protein. According to their biogenesis and function, sncRNAs can be divided into three main categories: short interfering RNA (siRNA), microRNA (miRNA) and PIWI-interacting RNA (piRNA). These sncRNAs participate in translational regulation and can also regulate genes post-transcriptionally, and thus play an important role in growth and development. In the last decade, these tiny molecules have received wide attention. In this dissertation, I studied the biogenesis and function of siRNA and miRNA using bioinformatics approaches such as machine learning and pattern recognition. My work consists of the following four parts:

1. We proposed a new pre-miRNA identification method, miRenSVM. By extracting and selecting the most appropriate features, miRenSVM can accurately identify pre-miRNAs whose secondary structure contains multiple loops. With an ensemble classifier, we successfully addressed the ongoing class imbalance problem in pre-miRNA identification. Compared with three other methods, miRenSVM achieved better performance. Furthermore, performance was evaluated on 27 additional species in miRBase 13.0, and 92.84% (4863/5238) of animal pre-miRNAs were correctly identified by miRenSVM.

2. To manage and use known miRNAs efficiently, we developed miRFam, a supervised classification method that automatically assigns miRNAs to families. Working from primary sequences alone, miRFam converts them into numerical features using n-grams and classifies miRNAs into families with a multi-class SVM algorithm. Compared with traditional sequence alignment methods, it consistently achieved better efficiency and accuracy. A series of tests showed that miRFam's speed and accuracy meet the requirements of real applications, so it could greatly reduce human and material costs.

3. By integrating four existing methods, we developed imiRTP, the first integrated method for predicting miRNA targets in Arabidopsis thaliana. More detailed studies suggest that the mechanism by which miRNAs regulate their targets is more complicated than previously thought, so existing methods alone are not sufficient to solve the target prediction problem. Drawing on the successful experience of animal miRNA target prediction, we carefully chose four popular methods to integrate into imiRTP, and we also defined four filters to select higher-quality miRNA targets.

4. We developed a novel TAS gene identification pipeline and proposed a new model to explain the relationship between TAS2 and some Arabidopsis thaliana PPR genes. The development of next-generation sequencing technology has greatly facilitated genomics studies. Based on several Illumina small RNA sequencing datasets, together with pattern recognition methods, we developed the pipeline ta-siRoot and used it to explore Arabidopsis thaliana TAS genes at the genome-wide scale. The results show that ta-siRoot is more reliable and accurate than traditional statistics-based methods. Based on the prediction results, we further studied the mechanism by which primary ta-siRNAs trigger secondary ta-siRNAs and proposed a secondary two-hit trigger model.
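
As an illustration of the n-gram step described in part 2, the following minimal sketch turns a primary RNA sequence into a fixed-length numerical vector. It is not the dissertation's implementation: the function name ngram_features, the choice of n, the normalization to frequencies, and the handling of ambiguous bases are all assumptions made for this example.

```python
from itertools import product

ALPHABET = "ACGU"  # assumption: features are defined over the four RNA bases


def ngram_features(seq, n=2):
    """Return normalized n-gram frequencies of an RNA sequence as a list.

    Hypothetical helper for illustration only. The vector has 4**n entries,
    ordered lexicographically over ALPHABET.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    grams = ["".join(p) for p in product(ALPHABET, repeat=n)]
    counts = dict.fromkeys(grams, 0)
    total = 0
    # Slide a window of width n over the sequence and count each n-gram.
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if gram in counts:  # skip windows containing ambiguous bases such as N
            counts[gram] += 1
            total += 1
    if total == 0:
        return [0.0] * len(grams)
    return [counts[g] / total for g in grams]


# Toy sequence, not a real miRNA: the 2-gram vector has 16 entries.
features = ngram_features("ACGUACGU", n=2)
```

Vectors of this kind could then be passed to any multi-class SVM implementation, for example scikit-learn's SVC; the abstract does not say which implementation or which value of n was used.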
Keywords/Search Tags: microRNA, ta-siRNA, Identification, Prediction, Target