
Chinese-Mongolian Statistical Machine Translation System Based On Affix Features

Posted on: 2011-12-22, Degree: Master, Type: Thesis
Country: China, Candidate: M N Song
GTID: 2178360305491542, Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of information and increasingly frequent international exchange, the potential demand for machine translation is growing. Depending on the underlying theory, machine translation methods can be divided into rule-based, corpus-based, and hybrid methods, each with its own advantages and limitations. Rule-based methods can describe language features very accurately, but they struggle to cover all linguistic phenomena. Among corpus-based approaches, example-based methods produce high-quality translations, but their hit rate is low and their demands on the corpus are strict. Statistical methods alleviate the knowledge-acquisition bottleneck, but their n-gram models cannot capture long-distance dependencies, and they also suffer from imperfect corpora and data sparsity. No single method therefore achieves the desired results. As a consequence, hybrid-strategy machine translation has become a focus of machine translation research, since it can offset the weaknesses of the individual methods and improve the translation results.
Mongolian is an agglutinative language: word formation and inflection are both accomplished by attaching different suffixes. In terms of basic word order, Mongolian is an SOV (subject-object-verb) language. In Chinese-Mongolian translation, errors in word-form changes and disordered sentences are therefore especially prominent.
This thesis therefore focuses on the characteristics of Mongolian. Building on Moses, the open-source phrase-based statistical machine translation decoder, it adds two features: a long-distance Mongolian language model based on trigger pairs and a Chinese-Mongolian reordering model.
The experimental results show that the translations are reasonable and that, relative to the Moses baseline system, translation performance improves significantly.
Keywords/Search Tags:Machine translation, Reorder model, Statistical Language Model, Trigger pair, Moses decoder