
Research On Search Strategy In Phrase-based Machine Translation

Posted on: 2016-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: C F Zhu
GTID: 2308330461956531
Subject: Computer technology
Abstract/Summary:
As globalization advances, intercultural communication is becoming increasingly common. In particular, the explosive growth of the Internet has produced a huge amount of multilingual information and a strong demand for cross-language exchange. Automated translation technology is urgently needed to meet this demand, and this need has driven the development of machine translation. Among the many kinds of translation systems, phrase-based machine translation (PBMT) has become increasingly popular in both research and practical applications because it is efficient and easy to build.

Decoding in PBMT is a structured search problem and is NP-complete. The widely adopted search algorithms are therefore heuristic and prune a large number of hypotheses during decoding. Pruning introduces search errors, so model errors are not the only source of error in PBMT. In machine translation, a search error occurs when the translation hypothesis that the model would score highest is never produced because it was pruned during decoding.

To address the search error problem, we work on two aspects.

First, we try to reduce search errors during pruning by making translation hypotheses more comparable, so that the pruned hypotheses are more likely to be worse than those preserved. We propose two ways to improve hypothesis comparability. (1) We use coverage stacks in place of the stacks indexed by the number of translated words in beam search. (2) We introduce a new future cost estimation method, the route-based future cost estimate, to replace the widely used phrase-based future cost estimate.

Second, we propose a new method for computing the potential-BLEU in the search-aware tuning framework. We call it route-based potential-BLEU, because it uses a complete translation hypothesis to compute the potential-BLEU more accurately, instead of merely piecing together word sequences.
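The two baseline ideas that the abstract sets out to replace can be illustrated with a minimal Python sketch. It is not taken from the thesis: the names `Hypothesis`, `prune`, `by_word_count`, `by_coverage`, and `future_cost_table` are hypothetical, scores are assumed to be log-probabilities (higher is better), and the future cost shown is the conventional phrase-based estimate, not the route-based estimate, whose details the abstract does not give. The sketch shows how the choice of stack key decides which hypotheses compete during pruning.

```python
from collections import defaultdict
from typing import NamedTuple


class Hypothesis(NamedTuple):
    coverage: frozenset  # indices of source words already translated
    score: float         # assumed log-probability so far (higher is better)


def prune(hyps, key, beam_size):
    """Group hypotheses into stacks by key(h); keep the best beam_size per stack."""
    stacks = defaultdict(list)
    for h in hyps:
        stacks[key(h)].append(h)
    return {
        k: sorted(v, key=lambda h: h.score, reverse=True)[:beam_size]
        for k, v in stacks.items()
    }


def by_word_count(h):
    # Conventional stacks: hypotheses compete if they cover the same
    # number of source words, even when the covered words differ.
    return len(h.coverage)


def by_coverage(h):
    # Coverage stacks: hypotheses compete only if they cover exactly
    # the same source words.
    return h.coverage


def future_cost_table(span_cost, n):
    """Conventional phrase-based estimate for every source span [i, j).

    span_cost maps (i, j) to the best score of a single phrase covering
    that span; spans without a phrase are simply absent.
    """
    neg_inf = float("-inf")
    table = {}
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            best = span_cost.get((i, j), neg_inf)
            # Try every split point and combine the two cheaper sub-spans.
            for k in range(i + 1, j):
                best = max(best, table[(i, k)] + table[(k, j)])
            table[(i, j)] = best
    return table


hyps = [
    Hypothesis(frozenset({0}), -1.0),
    Hypothesis(frozenset({2}), -1.2),
    Hypothesis(frozenset({1}), -3.0),
    Hypothesis(frozenset({0, 1}), -2.5),
]
kept_by_count = prune(hyps, by_word_count, beam_size=1)
kept_by_coverage = prune(hyps, by_coverage, beam_size=1)

span_cost = {(0, 1): -1.0, (1, 2): -2.0, (2, 3): -1.5, (0, 2): -2.5, (1, 3): -3.0}
table = future_cost_table(span_cost, 3)
```

With a beam size of 1, the word-count stacks make the three single-word hypotheses compete with one another although they cover different source words, while the coverage stacks compare only hypotheses with identical coverage. In a real decoder the future cost from the table would be added to each hypothesis score before pruning; that step is omitted here for brevity.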
Keywords/Search Tags: statistical machine translation, search error, decoding, future cost