Research Of RNAi Efficiency Based On Machine Learning

Posted on: 2009-11-27  Degree: Master  Type: Thesis
Country: China  Candidate: L Cha  Full Text: PDF
GTID: 2178360278956779  Subject: Computer Science and Technology
Abstract/Summary:
Since the identification of RNA-mediated interference (RNAi) in 1998, RNAi has become an effective tool for inhibiting gene expression. It has been widely applied in gene function analysis and is a potential therapeutic strategy in viral diseases, drug target discovery, and cancer therapy. The key to RNAi is designing siRNAs that act with high efficiency on their target genes. Although many programs have been developed for this purpose, designing a highly efficient siRNA remains challenging. Here we construct a computational model that incorporates multiple factors, and we analyse the relationship between RNAi efficiency and mRNA secondary structure.
With the growth of experimental data, it has become feasible to analyse RNAi efficiency using machine learning methods. To develop computational models for siRNA prediction, we first construct a training dataset and a test dataset from the siRecords database. Each sample in the training dataset is described by features drawn from nucleotide composition, sequence, thermodynamic properties, and secondary structure. Then, the RNA secondary structure profile technique and the Naïve Bayes method are used to study the relationship between RNAi efficiency and mRNA secondary structure. Furthermore, two computational models are constructed using a Support Vector Machine (SVM) and Artificial Neural Networks (ANN), respectively. Finally, for the first model, siRNAFilter-SVM, two parameters that strongly affect model performance, C and γ, are optimized using a genetic algorithm (GA). The second model, siRNAFilter-ANN, is optimized using bagging.
The TClass classification system automatically finds the best feature subset, which contains nine free energy features, with an accuracy of 74.67%. On the independent test dataset, the sensitivity and specificity are 16.3% and 92.1% for siRNAFilter-SVM, and 20.5% and 94.2% for siRNAFilter-ANN. Therefore, siRNAFilter-ANN performs better.
Using the nine free energy features extracted from the mRNA secondary structure of the siRNA-mRNA binding region, we obtain a classification accuracy as high as 74.67%, which indicates that mRNA secondary structure plays an important role in determining RNAi efficiency. Compared with siRNAFilter-SVM and other previously presented classifiers, siRNAFilter-ANN performs better in terms of both sensitivity and specificity, and can be used to design siRNAs for RNAi experiments.
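The sensitivity and specificity figures quoted above follow the standard definitions: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The sketch below shows that computation in Python. It is only an illustration: the function name, the 1 = efficient / 0 = inefficient label encoding, and the toy labels are assumptions, not the thesis's code or data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical): 1 = efficient siRNA, 0 = inefficient siRNA.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


# Toy labels for illustration only; these are not results from the thesis.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A classifier with low sensitivity and high specificity, as in the toy labels here, rarely labels a sample as efficient, but is usually right when it labels one as inefficient.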
Keywords/Search Tags: RNAi, Genetic Algorithm, Support Vector Machine, Artificial Neural Networks