Deep Learning Models for RNA-Protein Binding

Posted on: 2018-01-30
Degree: M.A.S
Type: Thesis
University: University of Toronto (Canada)
Candidate: Gandhi, Shreshth
Full Text: PDF
GTID: 2448390005953750
Subject: Electrical engineering
Abstract/Summary:
RNA-binding proteins (RBPs) are crucial biomolecules that fine-tune gene expression in cells. Each RBP prefers to bind to a specific RNA subsequence, like a key fitting a lock. Understanding the binding preferences of RBPs is an important step toward understanding the stages of gene expression in cells and toward addressing several genetic disorders. There are thousands of RBPs in humans, and only a small fraction of them are well understood. In this work, we develop deep neural network models that learn binding preferences for a large number of RBPs from high-throughput data, without requiring specific domain knowledge or feature engineering. Deep learning has improved the state of the art in several fields, such as image classification, speech recognition, and even genomics. Deep learning approaches obviate the need for careful feature engineering by learning useful representations directly from the data. We propose two deep architectures and use them to predict RNA-protein binding. Based on recent findings that show the importance of RNA secondary structure in RBP binding, we incorporate computationally predicted secondary structure features as input to our models and show that they boost prediction performance. We demonstrate that our models achieve significantly higher correlations on held-out in vitro test data than previous approaches. We show that our models generalize well to in vivo CLIP-seq data and achieve higher median AUCs than other approaches. We demonstrate that our models recover known preferences for proteins such as CPO and VTS1, and we report other proteins for which secondary structure plays an important role in binding. Finally, we demonstrate the strengths of our models relative to other approaches, such as the ability to combine information from distant positions along the input sequence.
Keywords/Search Tags: Deep learning, Models, Binding, RBPs, Approaches