Research On Thai Dependency Syntax Analysis Method Based On Cross-Language Transfer Learning

Posted on: 2018-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: G F Tao
Full Text: PDF
GTID: 2358330518961967
Subject: Electronic and communication engineering
Abstract/Summary:
With the development of computer science and artificial intelligence, Natural Language Processing (NLP), a branch of artificial intelligence, has played a significant role in promoting economic and cultural development, which makes research in NLP particularly important. Syntactic analysis is a core research topic in NLP and underlies machine translation, information retrieval, and text analysis. At present, Chinese, English, and other languages are parsed with traditional dependency parsing methods, but these relatively mature methods rely on large-scale annotated corpora and complex feature templates. Annotating corpora and designing feature templates is time-consuming and laborious, which makes the traditional methods difficult to apply to resource-scarce languages. This thesis therefore proposes a cross-lingual transfer learning method for the syntactic analysis of resource-scarce languages. Because Thai lacks annotated corpora for syntactic research, dependency parsing of Thai is carried out as follows:

(1) Bilingual distributed word representations learned with a neural network from a parallel sentence corpus. Research on Thai is relatively limited and no large-scale corpus exists, which increases the difficulty of Thai NLP. However, Chinese and Thai, both of which this thesis places in the Sino-Tibetan language family, are very similar in syntax. Because Chinese NLP has rich corpora, Thai can draw on Chinese resources. Bilingual distributed word representations build the link between the two languages, so this thesis trains a bilingual distributed word representation model on a parallel sentence corpus. Experimental results show that the accuracy of the word representations reaches 82.60%.

(2) Dependency parsing of Thai based on transfer learning. Research on Chinese syntactic analysis is relatively mature. Therefore, building on the bilingual distributed word representations and a corpus of 40,000 parallel sentences, features are transferred from Chinese dependency parsing to Thai. A neural network dependency parsing model is proposed; its dependency arc accuracy, identification accuracy, and root node accuracy are 79.28%, 75.01%, and 91.25%, respectively.

(3) Visualization of the Thai dependency parsing system. The model is loaded in Java and outputs dependency analyses in CoNLL format, and the DependencyViewer tool generates the display interface, which makes it easy to inspect the sentence-level dependency view and tree view.

In summary, the cross-lingual transfer learning method alleviates, to a certain degree, the scarcity of Thai corpora, while also taking the characteristics of the Thai language into account. Bilingual distributed word representations are the basis of transfer-learning-based syntactic parsing, and that parsing is in turn a specific application of those representations. The parsing achieves good results.
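The abstract does not define its three accuracy figures, so the following is only a minimal sketch of how scores of this kind are commonly computed from CoNLL-style output. It assumes, without confirmation from the thesis, that the accuracy of the dependency arc means the share of tokens with the correct head, that identification accuracy means the share with both the correct head and the correct relation label, and that root accuracy checks whether the predicted root matches the gold root. The four-column layout, the function names, and the English toy sentence are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# One token: (index, form, head, deprel). A head of 0 marks the root.
# This 4-column layout is a simplification assumed for illustration;
# real CoNLL files carry more columns.
Token = Tuple[int, str, int, str]


def parse_conll(block: str) -> List[Token]:
    """Parse one sentence in the simplified tab-separated layout."""
    tokens: List[Token] = []
    for line in block.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        idx, form, head, deprel = line.split("\t")
        tokens.append((int(idx), form, int(head), deprel))
    return tokens


def score(gold: List[Token], pred: List[Token]) -> Dict[str, object]:
    """Compare a predicted parse with a gold parse of the same sentence."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sentences differ in length")
    n = len(gold)
    # Tokens whose predicted head matches the gold head.
    arcs = sum(g[2] == p[2] for g, p in zip(gold, pred))
    # Tokens whose head and relation label both match.
    labeled = sum(g[2] == p[2] and g[3] == p[3] for g, p in zip(gold, pred))
    gold_roots = {g[0] for g in gold if g[2] == 0}
    pred_roots = {p[0] for p in pred if p[2] == 0}
    return {
        "arc_accuracy": arcs / n,
        "labeled_accuracy": labeled / n,
        "root_correct": gold_roots == pred_roots,
    }


# Toy data, not taken from the thesis.
gold_text = "1\tI\t2\tnsubj\n2\tread\t0\troot\n3\tbooks\t2\tobj\n"
pred_text = "1\tI\t2\tnsubj\n2\tread\t0\troot\n3\tbooks\t2\tnmod\n"

result = score(parse_conll(gold_text), parse_conll(pred_text))
print(result)
```

In this toy case every head is correct but one label is wrong, so the arc score is higher than the labeled score, which is the same ordering as the 79.28% and 75.01% figures reported in the abstract.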
Keywords/Search Tags: Thai Syntactic Analysis, Dependency Parser, Transfer Learning, Neural Network, Bilingual Word Embeddings