On Word Alignment Models for Statistical Machine Translation

Posted on: 2012-07-08
Degree: Ph.D.
Type: Dissertation
University: University of Rochester
Candidate: Zhao, Shaojun
Full Text: PDF
GTID: 1458390008499474
Subject: Computer Science
Abstract/Summary:
Machine translation remains the holy grail of computational linguistics. All statistical machine translation systems are built upon the idea of word alignment. While the field of word alignment has made tremendous progress in the last two decades, it is still in great need of improvements in speed and quality.

We designed a fertility hidden Markov model for word alignment, which is dramatically faster than the most widely used IBM Model 4. In fact, our model is even faster than the hidden Markov model and has a lower alignment error rate (AER). An experiment on Chinese-English translation shows that our word alignment model leads to better translation results than IBM Model 4, as measured by the BLEU metric.

We also designed algorithms that use word alignment to mine massive, high-quality bilingual texts from the web for a variety of language pairs. The resulting data improved a state-of-the-art machine translation system.
Keywords/Search Tags:Word alignment, Translation, Machine, Model