
Study on Systems Biology Methods of miRNA Target Prediction

Posted on: 2010-11-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Liu
Full Text: PDF
GTID: 1100360278461416
Subject: Communication and Information System
Abstract/Summary:
MicroRNAs (miRNAs) are a class of single-stranded non-coding RNAs approximately 22 nucleotides in length. miRNAs play important regulatory roles at the post-transcriptional stage by either degrading mRNA or inhibiting translation, and they are thought to be involved in many biological processes and diseases. Identifying miRNA targets is key to understanding the mechanisms of miRNA regulatory function; however, it has become the bottleneck of miRNA research because of the lack of effective biological experiments and high-accuracy prediction algorithms. Given this fact, this thesis develops a systems biology approach that integrates sequence-level information, gene expression information, protein interaction information, and other biological prior knowledge, aiming to contribute to miRNA target identification.

Firstly, a broad and thorough survey of a large number of existing computational miRNA target prediction algorithms is carried out. Based on an investigation of the mechanisms of each algorithm and a performance evaluation of several algorithms, the shortcomings of existing algorithms are summarized. Possible research directions are pointed out as well.

Secondly, a two-stage SVM-based algorithm, SVMicrO, is proposed for target prediction at the sequence level. A large number of positive and negative samples are carefully derived from the most up-to-date literature to build the training and evaluation datasets. Based on statistical characteristics discovered in earlier research as well as our understanding of miRNA:target interactions, 113 and 30 novel features are extracted for constructing Site-SVM and UTR-SVM, respectively. mRMR and SFS are used for feature selection. Sample weights as well as class weights are introduced into the SVM to deal with the imbalanced dataset. To compare performance, SVMicrO and several other popular algorithms are evaluated on the results of high-confidence target identification experiments.
The simulation results show that SVMicrO achieves better performance and generalization capacity.

Thirdly, considering that miRNA can induce mRNA degradation, this thesis examines in depth gene expression profiling in miRNA over-expression experiments and builds a Bayesian inference model based on microarray data and sequence-level prediction. In this model, a logistic regression model is used to map the SVMicrO prediction result to probability space, and a Gaussian mixture model, whose parameters are estimated by the VBEM algorithm, is built to model the gene expression profiling data. The evaluation results indicate that the proposed algorithm, which integrates the two types of information, outperforms both sequence-based prediction and prediction based on expression data alone.

Fourthly, given that the primary function of miRNA is translation inhibition, an algorithm called SysMicrO, which introduces a new concept for miRNA target prediction, is proposed. It is based on the novel causality hypothesis "Pathway → Transcription factor → regulated gene set", which considers the regulatory relationships between miRNA and protein as well as between protein and protein. As the first step in carrying out the algorithm, a database of transcription-factor-regulated gene sets and transcription factor upstream regulation networks is constructed. For prediction, GSEA is used to detect the enriched gene sets, which indicate that the related transcription factors are affected by the miRNA. Finally, the predictions are obtained by intersecting the transcription factor upstream genes with the SVMicrO predictions. The evaluation results show that SysMicrO not only improves specificity by screening the SVMicrO predictions, but also provides annotation and biological explanation, which are meaningful information for subsequent analysis and target identification.
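
The Bayesian integration of sequence-level and expression evidence described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the logistic coefficients `a` and `b`, and the Gaussian parameters for the "target" and "background" components are all hypothetical, and the mixture parameters are fixed by hand here rather than estimated by VBEM. The sketch only shows how a classifier score mapped to a prior probability might be combined with a two-component Gaussian likelihood of the log fold change through Bayes' rule.

```python
import math


def logistic(score, a=1.0, b=0.0):
    """Map a raw classifier score to a probability in (0, 1).

    a and b are hypothetical coefficients; in practice they would be fitted.
    """
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_target(score, log_fold_change,
                     target=(-1.0, 0.5), background=(0.0, 0.5),
                     a=1.0, b=0.0):
    """Posterior probability that a gene is a target.

    The prior comes from the sequence-level score; the likelihood comes from
    a two-component Gaussian model of the expression change after miRNA
    over-expression. target and background are assumed (mean, std) pairs.
    """
    prior = logistic(score, a, b)
    like_target = gaussian_pdf(log_fold_change, *target)
    like_background = gaussian_pdf(log_fold_change, *background)
    numerator = prior * like_target
    return numerator / (numerator + (1.0 - prior) * like_background)


if __name__ == "__main__":
    # A down-regulated gene with a neutral sequence score.
    print(posterior_target(score=0.0, log_fold_change=-1.0))
```

Under these assumed parameters, a gene whose expression drops after miRNA over-expression receives a posterior above its sequence-only prior, while an unchanged gene receives a posterior below it.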
Keywords/Search Tags: microRNA (miRNA), target prediction, Support Vector Machine (SVM), Bayesian inference, Gene Set Enrichment Analysis (GSEA)