Font Size: a A A

A collection of novel algorithms for low memory HMM decoding allowing for parallelization and variable sequences

Posted on:2008-03-28Degree:Ph.DType:Dissertation
University:Washington University in St. LouisCandidate:Keibler, Evan MFull Text:PDF
GTID:1448390005479325Subject:Biology
Abstract/Summary:
The hidden Markov model (HMM) has proven to be a valuable tool for probabilistic modeling in fields from computer vision to speech recognition. Recently the HMM and its variants have been successfully applied to many problems in computational biology. Memory usage and running time requirements of standard algorithms make applying these models to large sequences (such as full human chromosomes) infeasible. Here we present a novel algorithm for optimal decoding of large sequences in low, near constant memory for any length sequence. We then extend this approach to allow for optimal decoding of a single sequence using multiple processors in parallel. Using concepts derived from our work on low memory and parallel HMM decoding, we then present an algorithm for decoding sequences containing variable character (e.g. SNPs). This algorithm produces the optimal decoding for each possible variant of the sequence being considered. We implemented these algorithms in gene prediction and sequence alignment software and present results of genome scale experiments.
Keywords/Search Tags:HMM, Sequence, Decoding, Algorithms, Memory, Low
Related items