An MT-oriented Research On Recognition Of Tibetan Syntactic Functional Chunk

Posted on: 2017-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: T H Wang
Full Text: PDF
GTID: 2308330503958933
Subject: Computer Science and Technology
Abstract/Summary:
As research on Tibetan linguistic theory deepens, in-depth study of the syntactic parsing and semantic understanding of Tibetan has become a recent focus of attention. Tibetan syntactic chunk parsing plays an important role in the further development of Tibetan language processing. On the one hand, syntactic functional chunk parsing is an effective route toward solving the complete syntactic parsing problem. On the other hand, it can be applied directly to other natural language processing fields, such as machine translation, automatic question answering, and information retrieval. In this thesis, we perform syntactic functional chunk parsing with statistical methods, starting from the features of Tibetan itself. The main work is as follows:
1. According to the classification of Tibetan syntactic functional chunks, and starting from the current situation of Tibetan, we split the chunking task into functional chunk boundary recognition and functional chunk type annotation. We then use the Tibetan syllable as the unit for recognizing syntactic functional chunk boundaries. The F value reaches 79.12% in the experiment.
2. Because Tibetan chunk data is scarce, we add word segmentation and POS tagging to boundary recognition and try several common statistical models. In experiments, the F value of chunk boundary recognition reaches 86.76% with the CRF (conditional random field) model.
3. Building on the CRF-based chunk boundary recognition of part 2, we introduce error-driven learning. We apply the CRF model and transformation-based error-driven learning separately to improve the result of part 2. Then, after analyzing the recognition results, we combine the two approaches and obtain a more effective result. On that basis, we perform syntactic functional chunk type annotation, and the F value reaches 83.12%.
4. Finally, we integrate our work into a machine translation system. Analysis of specific examples shows that syntactic functional chunking is useful and can improve the output of the machine translation system.
Keywords/Search Tags: Tibetan, syllable units, syntactic functional chunk, machine learning model, error-driven learning strategy, Tibetan-English machine translation
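
The abstract reports chunk boundary recognition results as an F value but does not say how it is computed. As a rough illustration only, the sketch below assumes the conventional chunk-level F measure (the harmonic mean of precision and recall over exactly matching chunk spans) and a hypothetical B/I/O boundary tag scheme over syllables or words. The tag set, the function names `boundary_spans` and `f_value`, and the toy sequences are assumptions for illustration, not taken from the thesis.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) unit indices, end exclusive


def boundary_spans(tags: List[str]) -> Set[Span]:
    """Collect chunk spans from B/I/O boundary tags (assumed tag scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        # A new "B" or an "O" closes any chunk that is currently open.
        if tag in ("B", "O") and start is not None:
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
        elif tag == "I" and start is None:
            start = i  # tolerate an "I" that opens a chunk
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def f_value(gold: List[str], pred: List[str]) -> float:
    """Chunk-level F measure: a predicted chunk counts only if both
    of its boundaries match a gold chunk exactly."""
    gold_spans = boundary_spans(gold)
    pred_spans = boundary_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Toy sequences (invented): the prediction splits one gold chunk in two.
gold = ["B", "I", "B", "I", "I", "O", "B"]
pred = ["B", "I", "B", "I", "B", "O", "B"]
score = f_value(gold, pred)
print(f"F value: {score:.4f}")
```

In this toy case two of the four predicted chunks match two of the three gold chunks, so precision is 1/2, recall is 2/3, and the F value is 4/7. Under the two-stage design described in the abstract, type annotation would then be scored on spans paired with a chunk type rather than on bare spans.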