A machine learning approach to prediction of RNA editing events

Posted on:2011-08-22

Degree:M.S

Type:Thesis

University:Lehigh University

Candidate:Stoev, Ivan

GTID:2448390002950219

Subject:Computer Science

Abstract/Summary:

Adenosine-to-inosine (A-to-I) RNA editing is a post-transcriptional process that alters the RNA molecule. It is important to study this process because deficiency or misregulation of A-to-I RNA editing may be the cause of neurological diseases. However, the RNA editing machinery is still poorly understood, and the number of known recoding editing substrates remains limited. The goal of this thesis is to develop a machine learning approach to predicting novel editing sites based on a variety of features. The thesis details and implements the Support Vector Machine (SVM) classification algorithm with support for graph and string kernels. The graph kernels enable machine learning from RNA foldback structures (secondary structures computed by the RNA Editing Dataflow System, REDS). String kernels allow for learning based solely on nucleotide sequence features. Multiple classifiers are designed and evaluated with training data from experimental lab work done at Lehigh University. In addition, because it is difficult to determine a truly negative class (sites that never undergo editing), we ran experiments with the single-class SVM on some of the classifiers. Our results indicate that the mismatch kernel [Leslie et al., 2004] classifier generalizes best of all the classifiers we tested. The mismatch kernel classifier achieved a precision of 0.88 and a sensitivity of 0.82 in leave-one-out cross-validation tests. Using this classifier, we suggest new high-confidence RNA editing candidate sites that could later be verified experimentally in the lab.
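To make the string-kernel idea in the abstract concrete, the following is a minimal sketch of a (k, m)-mismatch kernel in the spirit of Leslie et al. It is not the thesis implementation: the function names, the RNA alphabet "ACGU", the default values k=3 and m=1, and the brute-force neighborhood enumeration are all assumptions chosen for readability. Each k-mer in a sequence contributes a count to every k-mer within m mismatches of it, and the kernel value is the dot product of the two resulting count vectors.

```python
from collections import Counter
from itertools import combinations, product
from math import sqrt

ALPHABET = "ACGU"  # assumed RNA alphabet; not specified in the abstract


def mismatch_neighborhood(kmer, m, alphabet=ALPHABET):
    """Return every string of len(kmer) within Hamming distance m of kmer."""
    neighbors = {kmer}
    for d in range(1, m + 1):
        # Choose d positions to overwrite, then try every letter combination.
        for positions in combinations(range(len(kmer)), d):
            for letters in product(alphabet, repeat=d):
                chars = list(kmer)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                neighbors.add("".join(chars))
    return neighbors


def mismatch_features(seq, k, m):
    """Sparse feature vector: counts over the mismatch neighborhoods of all k-mers."""
    features = Counter()
    for i in range(len(seq) - k + 1):
        for neighbor in mismatch_neighborhood(seq[i:i + k], m):
            features[neighbor] += 1
    return features


def mismatch_kernel(x, y, k=3, m=1):
    """Unnormalized (k, m)-mismatch kernel: dot product of the feature vectors."""
    fx = mismatch_features(x, k, m)
    fy = mismatch_features(y, k, m)
    return sum(count * fy[kmer] for kmer, count in fx.items())


def normalized_mismatch_kernel(x, y, k=3, m=1):
    """Cosine-normalized kernel, so that K(x, x) = 1 for any non-trivial x."""
    kxx = mismatch_kernel(x, x, k, m)
    kyy = mismatch_kernel(y, y, k, m)
    if kxx == 0 or kyy == 0:
        return 0.0
    return mismatch_kernel(x, y, k, m) / sqrt(kxx * kyy)
```

A Gram matrix built from such a function could be passed to any SVM implementation that accepts precomputed kernels (for example, scikit-learn's `SVC(kernel="precomputed")`); which SVM code the thesis actually used is not stated here. The brute-force enumeration is only practical for small k and m.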

Keywords/Search Tags:

RNA editing, Machine learning approach

Related items

1	A machine learning approach to automate classification of literature in a SAM research database
2	Using Instance-Level Meta-Information to Facilitate a More Principled Approach to Machine Learning
3	A probabilistic reasoning-based approach to machine learning
4	Cross-project Software Defect Prediction Based On Machine Learning
5	Automated cinematography and editing for three-dimensional computer graphics scenes
6	Machine Learning Approach to Retrieving Physical Variables from Remotely Sensed Data
7	Programming by demonstration: A machine learning approach
8	A bilevel optimization approach to machine learning
9	Research On Machine Learning Approach For Network Anomaly Detection And Response
10	Research On Active Fault-Tolerant Control Of Nonlinear Systems Based On Learning Approach