Plant miRNA Mining and Target Prediction

Posted on: 2010-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Jiang
GTID: 2180360302455552
Subject: Crop Genetics and Breeding
Abstract/Summary:
Plant microRNAs (miRNAs) are 21~24 nt sequences derived from the sequential processing of a hairpin-like pre-miRNA, which is itself derived from a pri-miRNA with a larger secondary structure. In the nucleus, the pre-miRNA is processed by a Dicer-like enzyme to form the miRNA::miRNA* duplex, which carries a 2-nt overhang at the 3' end and is then exported to the cytoplasm. The miRNA* is subsequently degraded, while the miRNA is guided to its specific mRNA by the RISC (RNA-induced silencing complex) and down-regulates gene expression post-transcriptionally, either by repressing or by cleaving the mRNA. Functionally, plant miRNAs play an important role in development, cell differentiation, and resistance to stress.

In this study, three programs were developed: miRseacher, expr_diff and pick_plant_target. miRseacher predicts pre-miRNAs with an SVM using novel sequence-structure features, reaching a specificity of 98.95% and a sensitivity of 99.19%. The AUC value (column-wise area under the ROC curve) reached 98.65%, which suggests that miRseacher is effective at distinguishing true pre-miRNAs from false ones. expr_diff analyzes differences in miRNA expression based on the Poisson distribution and Solexa sequencing data. pick_plant_target predicts miRNA targets based on the core rule that a plant miRNA is nearly perfectly complementary to its target mRNA, so that the two form a miRNA::mRNA duplex. In addition, pick_plant_target takes into account the number of mismatches, the positions of the mismatches in the miRNA::mRNA duplex, and the energy of the duplex.

Next-generation deep sequencing technologies such as Illumina/Solexa and 454 also provide a new way to mine miRNAs in plants and even in animals.
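The near-perfect complementarity rule can be illustrated with a minimal Python sketch. This is not the pick_plant_target implementation: the function names, the `max_mismatches` threshold, and the decision to ignore G:U wobble pairs and duplex energy are all assumptions made for illustration. The sketch slides a window along the mRNA, compares it with the reverse complement of the miRNA, and reports the number and positions of mismatches, with positions counted from the miRNA 5' end.

```python
# Illustrative only: hypothetical names and threshold, not pick_plant_target itself.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string made of A, C, G, U."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mirna, mrna, max_mismatches=3):
    """Scan mrna (5'->3') for sites nearly complementary to mirna (5'->3').

    Returns a list of (start, site, mismatch_positions) tuples, where start is
    the 0-based offset in the mRNA and mismatch_positions are 1-based positions
    counted from the miRNA 5' end. G:U wobble pairs count as mismatches here.
    """
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    perfect_site = reverse_complement(mirna)
    n = len(mirna)
    hits = []
    for start in range(len(mrna) - n + 1):
        site = mrna[start:start + n]
        # Site index i pairs with miRNA index n-1-i, i.e. 1-based position n-i.
        mismatch_positions = sorted(
            n - i for i, (a, b) in enumerate(zip(site, perfect_site)) if a != b
        )
        if len(mismatch_positions) <= max_mismatches:
            hits.append((start, site, mismatch_positions))
    return hits
```

Calling `find_target_sites(mirna, mrna, max_mismatches=0)` keeps only perfectly complementary sites. A fuller scoring scheme would weight mismatches by position and add a duplex energy term, which this sketch leaves out.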
In this study, we used the mireap pipeline developed by BGI (Beijing Genomics Institute) to analyze two Solexa datasets of Arabidopsis thaliana downloaded from the NCBI GEO database: ATH-AGO1 (immunopurified Arabidopsis AGO1 complex, GSM253622) and ATH-AGO2 (immunopurified Arabidopsis AGO2 complex, GSM253623). The analysis proceeds in four steps: 1) all reads produced by Solexa sequencing are mapped to the reference sequence of interest using soap; 2) the mapped reads are annotated, i.e., classified into small RNA categories such as mRNA, tRNA, rRNA, repeat, miRNA, siRNA, scRNA, snoRNA, snRNA and unan (unannotated reads); 3) tRNA, rRNA and other non-coding RNA reads are filtered out; 4) the remaining unannotated reads are passed to the mireap software for miRNA prediction. In total, 152 and 103 known Arabidopsis thaliana miRNAs were expressed in ATH-AGO1 and ATH-AGO2, respectively, and the expression levels of 83 miRNAs differed significantly. Finally, Solexa-based miRNA mining yielded 177 and 22 previously unreported miRNAs in ATH-AGO1 and ATH-AGO2, respectively. It is well known that AGO1, a member of the AGO protein family, is part of the Arabidopsis thaliana miRNA RISC complex, whereas AGO2 is part of the siRNA RISC complex. This research shows, however, that some known miRNAs recorded in miRBase are expressed in immunopurified AGO2, and a few novel miRNAs were also mined from it.
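A Poisson-based comparison of one miRNA's read counts between two libraries can be sketched as follows. The abstract does not give the exact test used by expr_diff, so this is an assumption: it uses the standard conditional test for two Poisson counts, in which, under equal expression, the count in the first library follows a binomial distribution given the combined count. The function names and arguments are hypothetical.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials (0 < p < 1)."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )


def poisson_two_sample_pvalue(x, y, total_x, total_y):
    """Two-sided p-value for equal expression of one miRNA in two libraries.

    x, y             -- read counts of the miRNA in library 1 and library 2
    total_x, total_y -- total read counts of the two libraries (both > 0)

    If x and y are Poisson with rates proportional to library size, then
    x given x + y is Binomial(x + y, total_x / (total_x + total_y)).
    """
    n = x + y
    if n == 0:
        return 1.0
    p = total_x / (total_x + total_y)
    observed = binom_log_pmf(x, n, p)
    tolerance = 1e-9  # absorbs floating-point noise when comparing log-probabilities
    # Sum the probabilities of all outcomes no more likely than the observed one.
    pvalue = sum(
        math.exp(log_prob)
        for log_prob in (binom_log_pmf(k, n, p) for k in range(n + 1))
        if log_prob <= observed + tolerance
    )
    return min(1.0, pvalue)
```

When many miRNAs are tested this way, the resulting p-values would normally need a multiple-testing correction before any are called significant. The loop is linear in the combined count, so very highly expressed miRNAs may call for a normal approximation instead.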
Keywords/Search Tags: Support Vector Machine, Plant miRNA, Prediction, Poisson Distribution, Solexa