
Product models for sequences

Posted on: 2003-05-14
Degree: Ph.D
Type: Thesis
University: University of Toronto (Canada)
Candidate: Brown, Andrew Denis
Full Text: PDF
GTID: 2468390011981792
Subject: Computer Science
Abstract/Summary:
This thesis introduces a series of new graphical models for time series. These models all incorporate the idea of constructing a complex time series density model by combining the densities of many simpler models.

The first of these, the product of hidden Markov models (PoHMM), is a graphical model whose density over a sequence is the product of the densities of many small hidden Markov models (HMMs). Because each HMM in the PoHMM has its own hidden state variable, the PoHMM has a distributed hidden state, which allows for more compact descriptions of complex observations. In addition, the product formulation makes the hidden state of each constituent HMM independent conditioned on the data, allowing for efficient inference. However, maximum likelihood learning is complicated by the normalizing constant of the product distribution, which couples all the parameters in the PoHMM. The thesis presents the contrastive divergence algorithm, an approximate sampling method for performing maximum likelihood learning in a PoHMM and other product models. Experiments comparing a PoHMM trained using contrastive divergence to a standard-sized HMM trained by EM show the PoHMM to be superior at modelling complex, long-range structure in sequences.

The second model, the relative density network (RDN), is a supervised network which learns the relevant features of a class of data by using many small HMMs. Unlike other methods of supervised sequence classification, it does not insist on using a single model for each data class, but uses multiple models and lets the network decide how to allocate them among the classes. A set of experimental comparisons with other HMM-based supervised learning schemes on a problem in speaker recognition shows it to have better classification performance.

The last model, the spiking Boltzmann machine, is also a product model.
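The product formulation described above can be sketched in a few lines: each component HMM assigns the sequence a likelihood via the forward algorithm, and the PoHMM's unnormalized log-density is the sum of those log-likelihoods (the intractable normalizing constant is what contrastive divergence sidesteps). This is an illustrative sketch, not the thesis's implementation; the discrete-emission parameterization and all function names are assumptions.

```python
import numpy as np

def hmm_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete-emission HMM via the scaled forward algorithm.

    obs: sequence of symbol indices
    pi:  initial state distribution, shape (S,)
    A:   state transition matrix, shape (S, S)
    B:   emission matrix, shape (S, V)
    """
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()          # rescale at each step to avoid underflow
    log_lik = np.log(c)
    alpha /= c
    for t in range(1, len(obs)):
        alpha = (alpha @ A) * B[:, obs[t]]
        c = alpha.sum()
        log_lik += np.log(c)  # accumulate the log of each scale factor
        alpha /= c
    return log_lik

def pohmm_unnormalized_log_density(obs, hmms):
    """Unnormalized PoHMM log-density: the sum of the component HMM log-densities.

    The true density divides by a partition function that couples all the
    parameters, which is why exact maximum-likelihood learning is hard.
    """
    return sum(hmm_log_likelihood(obs, pi, A, B) for (pi, A, B) in hmms)
```

For example, a PoHMM built from two copies of the same two-state HMM assigns a sequence exactly twice that HMM's log-likelihood before normalization.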
It uses a combination of binary and continuous hidden state representations, with smooth temporal dynamics, so that it can model slowly varying signals. An example with synthetic data shows that it can learn to model and predict simple dynamic image sequences.
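The contrastive divergence learning rule mentioned above replaces the intractable model expectation in the likelihood gradient with statistics from a brief Gibbs chain started at the data. A minimal one-step (CD-1) update for a binary restricted Boltzmann machine, the simplest product model, might look like the sketch below; this is an illustration of the general technique on an RBM, not the thesis's PoHMM training code, and the learning rate and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1):
    """One CD-1 parameter update for a binary RBM.

    v0: batch of visible vectors, shape (N, V)
    W:  weights, shape (V, H); b: visible biases (V,); c: hidden biases (H,)
    """
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One Gibbs step: reconstruct the visibles, then re-infer the hiddens.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b = b + lr * (v0 - v1).mean(axis=0)
    c = c + lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

The key design point carried over to the PoHMM setting is that sampling briefly from the model, rather than running the chain to equilibrium, gives a cheap, low-variance approximation to the gradient of the normalized product density.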
Keywords/Search Tags: Model, Product, Sequences, Hidden state, Data, HMM