The Design And Implementation Of Mongolian Word Analyzing And Correcting Based On Syllabic Statistical Language Method

Posted on: 2008-11-30  Degree: Master  Type: Thesis
Country: China  Candidate: J Zhao  Full Text: PDF
GTID: 2178360215491526  Subject: Computer application technology
Abstract/Summary:
With the rapid development of the information society, more and more electronic books, papers, and documents appear in our work. Automatic detection and correction of text errors has become an active focus for natural language processing (NLP) researchers. In Mongolian information processing, however, automatic correction of Mongolian text has not yet been well studied. So far, researchers have used dictionary-based methods for correction. These methods work well when the dictionary is small, but as the number of words grows, correction efficiency decreases. The goal of this paper is to put forward a new method for Mongolian text correction. The main work in this paper includes:

First, some background on Mongolian syntax is introduced, and the syllable characteristics of Mongolian words are analyzed from several perspectives, e.g. the length of a Mongolian word, the number of syllables in a word, and the position of syllables within a word.

Second, this paper introduces some well-known language models used in natural language processing and algorithms for computing text similarity, and puts forward a method for Mongolian correction based on a 2-gram model. The design of the proofreading model, the algorithm for model learning, and the algorithm for the Mongolian correction model are described in detail. The text error rules learned by the model are displayed as directed graphs.
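
As a rough illustration of the ideas named in the abstract (a 2-gram model over syllables combined with a text similarity measure), the following minimal Python sketch may help. It is not the thesis's algorithm: the add-one smoothing, the use of syllable-level edit distance as the similarity measure, the candidate lexicon, and every function name are assumptions made for this example. The syllables are Latin placeholders, not real Mongolian syllabification.

```python
from collections import Counter
import math

BOS, EOS = "<w>", "</w>"  # hypothetical word-boundary markers


def train_bigrams(words):
    """Count syllable unigrams (as contexts) and bigrams over syllabified words."""
    unigrams, bigrams, vocab = Counter(), Counter(), {EOS}
    for syllables in words:
        seq = [BOS, *syllables, EOS]
        unigrams.update(seq[:-1])           # every token that has a successor
        bigrams.update(zip(seq, seq[1:]))   # adjacent syllable pairs
        vocab.update(syllables)
    return unigrams, bigrams, len(vocab)


def log_prob(syllables, model):
    """Log-probability of a word under the bigram model (add-one smoothing, an assumption)."""
    unigrams, bigrams, vocab_size = model
    seq = [BOS, *syllables, EOS]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(seq, seq[1:])
    )


def is_suspicious(syllables, model):
    """Flag a word if it contains a syllable pair never seen in training."""
    _, bigrams, _ = model
    seq = [BOS, *syllables, EOS]
    return any(bigrams[(a, b)] == 0 for a, b in zip(seq, seq[1:]))


def edit_distance(a, b):
    """Levenshtein distance between two syllable sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def suggest(syllables, lexicon, model, max_dist=1):
    """Rank known words within max_dist syllable edits by bigram log-probability."""
    candidates = [w for w in lexicon if edit_distance(syllables, w) <= max_dist]
    return sorted(candidates, key=lambda w: log_prob(w, model), reverse=True)


# Toy training data: placeholder syllables, purely illustrative.
training = [["sa", "ra"], ["sa", "na"], ["na", "ra"], ["sa", "ra"]]
model = train_bigrams(training)
lexicon = {tuple(w) for w in training}

if is_suspicious(["sa", "ta"], model):
    print(suggest(["sa", "ta"], lexicon, model))
```

In this sketch, detection needs only the bigram counts, so its cost does not depend on the size of a word dictionary. The lexicon is used only to rank correction candidates, and a real system could generate candidates in other ways.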
Keywords/Search Tags: automatic proofreading, n-gram model, Mongolian scripts, syllable