Automatic annotation of multiple protein sequence alignments using recurrent neural networks

Posted on:2007-01-23

Degree:M.C.Sc

Type:Thesis

University:Dalhousie University (Canada)

Candidate:Aggarwal, Aditya

Full Text:PDF

GTID:2458390005988347

Subject:Computer Science

Abstract/Summary:

Manual annotation of multiple sequence alignments for phylogeny is a time-consuming and nontrivial task. In genomic-scale analyses especially, manually annotating hundreds or thousands of sequence alignments is not a practical option. To automate this process, we apply three neural network architectures, namely a multilayer feed-forward network, a recurrent neural network, and a bidirectional recurrent neural network, to detect regions of intrinsically poor alignment. Parameters were generated from a set of manually annotated Pfam multiple protein sequence alignments, which formed the training and testing data for the networks. The system is designed to distinguish noisy sites (the inadequate class) from informative sites (the valid class). Of the three architectures, the multilayer feed-forward network with no window size provided the highest classification precision for the valid class, at 94.6%. The best performance for the prediction of inadequate sites, 92.78%, was obtained with the bidirectional recurrent neural network. The classifiers can annotate multiple sequence alignments for the purpose of editing. This method is especially useful as a pre-processor for phylogenetic analyses at the genomic scale.
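The abstract reports precision separately for the valid and inadequate classes. As a rough illustration of how such per-class figures are typically computed, the sketch below takes precision for a class to be the fraction of sites predicted as that class that were also manually annotated as that class. The function name `per_class_precision` and the toy labels are assumptions made for this example; they are not taken from the thesis, and the thesis may define or aggregate its scores differently.

```python
from collections import Counter


def per_class_precision(true_labels, predicted_labels):
    """Return {class: precision}, with precision = TP / (TP + FP).

    Hypothetical helper for illustration; not the thesis implementation.
    Classes that are never predicted are omitted from the result.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have equal length")

    # How often each class was predicted (TP + FP).
    predicted_counts = Counter(predicted_labels)
    # How often a prediction agreed with the manual annotation (TP).
    correct_counts = Counter(
        p for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {
        cls: correct_counts[cls] / predicted_counts[cls]
        for cls in predicted_counts
    }


# Toy per-site annotations, invented for illustration (not thesis data).
truth = ["valid", "valid", "inadequate", "valid", "inadequate", "inadequate"]
predicted = ["valid", "valid", "valid", "valid", "inadequate", "inadequate"]

precision = per_class_precision(truth, predicted)
for cls, value in sorted(precision.items()):
    print(f"{cls}: {value:.2f}")
```

In this toy case one noisy site is mislabeled as valid, which lowers precision for the valid class while leaving precision for the inadequate class untouched; this is why the two figures can favor different architectures.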

Keywords/Search Tags:

Sequence alignments, Multiple, Recurrent neural, Network

Related items

1	The Study Of Heuristics Method For Multiple Sequence Alignments
2	A Simulated Annealing Approach To Multiple Sequence Alignment
3	Efficient construction of accurate multiple alignments and large-scale phylogenies
4	Multiple alignments of protein structures and their application to sequence annotation with hidden Markov models
5	An Anchor-based Algorithm For Multiple Genome Alignment
6	Research On Some Problems Of Image Sequence Recognition Based On Recurrent Neural Networks
7	Multiple Sequence Alignment Algorithm Research And Implementation
8	Research On Speech Synthesis Algorithm Based On Sequence To Sequence Model
9	Research On Sequence Recommendation Method Based On Hybrid Neural Network
10	Research On Recurrent Neural Network Based Dependency Parsing Model