Research on Mongolian Word Segmentation Based on the Cascaded Hidden Markov Model

Posted on: 2010-04-03  Degree: Master  Type: Thesis
Country: China  Candidate: W Cong  Full Text: PDF
GTID: 2178360278467870  Subject: Computer application technology
Abstract/Summary:
In the word-processing stage of Mongolian language information processing, automatic segmentation of words into etyma (stems) and suffixes is the basis of many later steps. Statistics on etyma and affixes, the compilation of dictionaries, sentence- and text-level processing, and information retrieval all depend on a correct split. Mongolian is an agglutinative language: word formation and inflection are both accomplished by attaching different suffixes to roots and etyma. These suffixes carry a great deal of grammatical information, so treating a Mongolian word as an indivisible unit loses much of that information. Only by correctly segmenting etyma and suffixes can we reveal a word's part of speech and its grammatical relationships.

In this thesis, I introduce the origin and significance of Mongolian word segmentation and compare and analyze the existing segmentation techniques; discuss the theory and techniques of the Hidden Markov Model and briefly introduce the method and advantages of the Cascaded Hidden Markov Model; analyze the characteristics of the Mongolian language and abstract its features; analyze the key techniques of Mongolian word segmentation based on the Cascaded Hidden Markov Model and, guided by the characteristics of Mongolian word formation and inflection, use that model to segment Mongolian words; and survey the commonly used smoothing techniques, with emphasis on Modified Kneser-Ney smoothing. The experimental results show that Mongolian word segmentation based on the Cascaded Hidden Markov Model performs well, achieving a satisfactory segmentation result of 0.9713.
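To make the HMM idea concrete, the following is a minimal sketch of Viterbi decoding for a single-layer HMM that tags each character of a word as B (begins a morpheme) or I (continues one). It is not the thesis's cascaded model, which would chain several such layers so that one layer's output feeds the next. The tag set, the toy input string, the function names, and all probabilities here are invented for illustration and assume a non-empty input.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit, floor=-20.0):
    """Most probable state sequence for a non-empty obs under a first-order HMM.

    Unseen emissions get the log-probability `floor` (a crude stand-in for
    real smoothing such as Modified Kneser-Ney).
    """
    # best[t][s]: best log-probability of any path ending in state s at position t
    best = [{s: log_start[s] + log_emit[s].get(obs[0], floor) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s].get(obs[t], floor)
            back[t][s] = prev
    # Follow back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def split_by_tags(word, tags):
    """Cut word into pieces, starting a new piece at every B tag."""
    pieces = []
    for ch, tag in zip(word, tags):
        if tag == "B" or not pieces:
            pieces.append(ch)
        else:
            pieces[-1] += ch
    return pieces

# Hypothetical toy parameters; a real system would estimate them from a corpus.
STATES = ["B", "I"]
LOG_START = {"B": 0.0, "I": -1e9}  # a word must begin a morpheme
LOG_TRANS = {
    "B": {"B": math.log(0.2), "I": math.log(0.8)},
    "I": {"B": math.log(0.3), "I": math.log(0.7)},
}
LOG_EMIT = {
    "B": {"g": math.log(0.5), "t": math.log(0.5)},
    "I": {"e": math.log(0.6), "r": math.log(0.4)},
}

word = "gertee"  # toy Latin-transliterated string, not a claimed gold segmentation
tags = viterbi(list(word), STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(split_by_tags(word, tags))
```

Working in log space avoids numerical underflow on long inputs, and the fixed `floor` marks the spot where a proper smoothing method would supply probabilities for unseen events.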
Keywords/Search Tags: Word Segment, Etyma, Suffix, Cascaded Hidden Markov Model, Mongolian