
The Research And Implementation Of "From-Bottom-to-top" Syntactic Parsing Of Traditional Mongolian Simple Sentence

Posted on: 2018-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: R H Wu
GTID: 2348330515952362
Subject: Computer Science and Technology
Abstract/Summary:
Syntactic parsing is one of the key technologies in Natural Language Processing (NLP); its basic task is to determine the constituent structure of a sentence. Because parsing has to handle complex syntactic structures, it has long been a difficult area of NLP and has developed relatively slowly. Mongolian is one of the languages used by ethnic minorities in China. Owing to the complexity of its linguistic features, research on Mongolian language processing has also progressed slowly. Based on the characteristics of Mongolian simple sentences, this thesis takes a rule-based approach to the syntactic parsing of traditional Mongolian simple sentences. The main contents of the thesis are the following four points:

(1) Establishing recovery rules for the subject word and case. Mongolian simple sentences often omit the case marker and the subject word. By studying the sentence patterns in which case and subject word are omitted, the thesis formulates recovery rules for them, so that the parts of the sentence can be analyzed more accurately.

(2) Rule-based from-bottom-to-top (bottom-up) parsing of traditional Mongolian simple sentences. The thesis proposes a from-bottom-to-top method suited to the features of traditional Mongolian simple sentences, and designs and implements the corresponding algorithm. The experimental results show that recovering the case and subject word greatly improves the accuracy of dividing a sentence into its components.

(3) Mongolian part-of-speech (POS) tagging. The POS tagging method in this thesis is based on a dictionary and rules. First, the dictionary is used to tag single words and phrases. Then multi-category words and unregistered words (words not in the dictionary) are tagged with the rule-based method. The experimental results show that the accuracy of POS tagging is over 95%, which meets the requirement of syntactic analysis.

(4) Improving the dictionary base and establishing the rule base. According to the characteristics of Mongolian words, the thesis establishes an affix library and improves the original dictionary. The rule base contains 141 verb rules (excluding archaic verb rules), 38 noun rules, and 15 adjective rules. The experimental results show that the more complete rule base greatly improves the accuracy of POS tagging.
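
To make the pipeline in points (2) and (3) more concrete, the following is a minimal sketch, not the thesis's actual implementation. It tags each word with a dictionary first and falls back to suffix rules, then combines the tags bottom-up with a simple shift-reduce loop. The lexicon entries, suffix rules, grammar rules, tag names, and the romanized tokens are all hypothetical placeholders chosen for illustration; the abstract does not give the real dictionary or rule base.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (romanized placeholder tokens, not thesis data).
LEXICON: Dict[str, str] = {"bi": "PRON", "sain": "ADJ", "nom": "NOUN"}

# Hypothetical suffix rules used only when a word is not in the lexicon.
SUFFIX_RULES: List[Tuple[str, str]] = [("na", "VERB"), ("tai", "ADJ")]

# Hypothetical grammar: right-hand side (sequence of labels) -> left-hand side.
GRAMMAR: Dict[Tuple[str, ...], str] = {
    ("PRON",): "NP",
    ("ADJ", "NOUN"): "NP",
    ("NP", "VERB"): "VP",  # object before verb, assuming SOV order
    ("NP", "VP"): "S",
}
MAX_RHS = max(len(rhs) for rhs in GRAMMAR)


def tag_word(word: str) -> str:
    """Dictionary lookup first, then suffix rules, then a default tag."""
    if word in LEXICON:
        return LEXICON[word]
    for suffix, tag in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return tag
    return "NOUN"  # assumption: treat unknown words as nouns


def parse(words: List[str]) -> list:
    """Greedy bottom-up (shift-reduce) parse; returns the final stack."""
    stack: list = []
    for word in words:
        stack.append((tag_word(word), word))  # shift a tagged leaf
        reduced = True
        while reduced:  # keep reducing while some rule matches the stack top
            reduced = False
            for n in range(min(len(stack), MAX_RHS), 0, -1):  # longest first
                labels = tuple(node[0] for node in stack[-n:])
                if labels in GRAMMAR:
                    children = stack[-n:]
                    del stack[-n:]
                    stack.append((GRAMMAR[labels], *children))
                    reduced = True
                    break
    return stack


if __name__ == "__main__":
    # A complete parse leaves a single node labelled "S" on the stack.
    print(parse(["bi", "sain", "nom", "unshina"]))
```

A stack with more than one node, or a top node other than `S`, signals an incomplete parse. In this sketch that is where a step such as the subject and case recovery of point (1) would have to intervene, though how the thesis does so is not described in the abstract.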
Keywords/Search Tags: syntactic parsing, traditional Mongolian simple sentence, recovery rules of subject word and case, from-bottom-to-top