Research On The Conversion Approach Between Cyrillic Mongolian And Traditional Mongolian Based On Rules And Statistics

Posted on: 2016-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: H W Wang
GTID: 2308330461983102
Subject: Computer Science and Technology
Abstract/Summary:
Mongolian, which belongs to the Altaic language family, is widely used across several countries and regions; its speakers are mainly distributed over China, Mongolia and Russia. The Mongolian used in China, called "Traditional Mongolian", and the Mongolian used in Mongolia, called "Cyrillic Mongolian", share the same pronunciation but have different written forms. As cultural, educational and economic cooperation and exchange between China and Mongolia deepen, research on conversion between Cyrillic Mongolian and Traditional Mongolian will become increasingly important. Such conversion will not only make exchanges between the two countries more convenient, but is also of great significance for the scientific, cultural and educational development of Mongolian.

This thesis studies conversion between Cyrillic Mongolian and Traditional Mongolian (CMTMC) with an approach that combines the advantages of rule-based methods and statistical models. A rule-based approach is used to convert in-vocabulary words, and a method based on a statistical model is used to convert out-of-vocabulary words. Some Cyrillic Mongolian words have more than one corresponding form in Traditional Mongolian, and vice versa, so an N-gram language model is used to choose among the candidates.

Experimental results show that both Cyrillic-to-Traditional (C2T) and Traditional-to-Cyrillic (T2C) conversion perform well. The word error rate of C2T conversion is 4.12%, and the word error rate of T2C conversion is 9.26%.
Keywords/Search Tags: Cyrillic Mongolian, Traditional Mongolian, Mongolian Conversion, Rule, Joint sequence model
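
The hybrid scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the lexicon, the corpus, and every function name below are hypothetical, the tokens are placeholders rather than real Mongolian words, and `convert_oov` is only a stand-in for the statistical out-of-vocabulary model (the sketch does not implement a joint sequence model). It assumes a bigram model with add-one smoothing as the N-gram language model and uses a Viterbi search to pick one candidate per word.

```python
import math
from collections import defaultdict

# Hypothetical rule lexicon: source word -> candidate target forms.
# Placeholder tokens only; these are NOT real Mongolian words.
LEXICON = {
    "c1": ["t1"],
    "c2": ["t2a", "t2b"],  # ambiguous: two candidate target forms
    "c3": ["t3"],
}

# Hypothetical target-side sentences used to train the bigram model.
CORPUS = [
    ["t1", "t2a", "t3"],
    ["t1", "t2a"],
    ["t2b", "t3"],
]


def train_bigram(corpus):
    """Count bigrams and their left contexts; return counts and vocab size."""
    context = defaultdict(int)
    bigrams = defaultdict(int)
    vocab = set()
    for sent in corpus:
        tokens = ["<s>"] + sent
        for prev, cur in zip(tokens, tokens[1:]):
            context[prev] += 1
            bigrams[(prev, cur)] += 1
            vocab.add(cur)
    return context, bigrams, len(vocab)


def log_prob(model, prev, cur):
    """Add-one smoothed log P(cur | prev)."""
    context, bigrams, vocab_size = model
    return math.log((bigrams[(prev, cur)] + 1) / (context[prev] + vocab_size))


def convert_oov(word):
    """Stand-in for the statistical OOV model: just tags the word."""
    return "oov:" + word


def candidates(word):
    """Lexicon lookup for in-vocabulary words, OOV fallback otherwise."""
    return LEXICON.get(word) or [convert_oov(word)]


def convert(sentence, model):
    """Viterbi search: best-scoring candidate sequence under the bigram model."""
    best = {"<s>": (0.0, [])}  # last token -> (log score, path so far)
    for word in sentence:
        nxt = {}
        for cand in candidates(word):
            score, path = max(
                ((s + log_prob(model, prev, cand), p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            nxt[cand] = (score, path + [cand])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


model = train_bigram(CORPUS)
print(convert(["c1", "c2", "c3"], model))  # ['t1', 't2a', 't3']
print(convert(["c1", "c9"], model))        # ['t1', 'oov:c9']
```

In this toy setup the ambiguous word `c2` resolves to `t2a` because that form follows `t1` more often in the training sentences, which is the kind of context-based choice the abstract attributes to the N-gram language model.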