
Research On Dependency Parsing Of Tibetan Language Based On Deep Learning

Posted on: 2022-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: J C R Duo
GTID: 2518306482973379
Subject: Computer application technology
Abstract/Summary:
Tibetan dependency parsing analyzes the components of a Tibetan sentence and identifies the dependency (head-dependent) relations that hold between them. The process consists mainly of Tibetan word segmentation and part-of-speech tagging, which belong to morphological analysis, followed by analysis of the higher-level dependency relations. Dependency parsing is therefore a core task in Tibetan natural language processing. Tibetan is also one of China's distinctive national languages, and tools that analyze and process it automatically would save researchers considerable time. However, the corpora available for Tibetan dependency parsing are relatively small, and related research is less mature than for Chinese and English, so the area still needs further work. According to the author, researchers have applied a variety of methods to Tibetan dependency parsing, but neural network methods had not previously been used. In response, this thesis starts from the existing theories and methods of Tibetan dependency parsing, analyzes the characteristics of the task in depth, and builds on existing results to carry out dependency parsing research that is better suited to the characteristics of Tibetan. The specific research content is as follows:

(1) Integrated Tibetan word segmentation and part-of-speech tagging strategy. Word segmentation and part-of-speech tagging are unavoidable steps in Tibetan dependency parsing, so the corpus must be constructed with the needs of dependency parsing in mind. Using a manually constructed Tibetan word segmentation and part-of-speech tagging corpus, this thesis proposes a Transformer-based method that performs automatic word segmentation and part-of-speech tagging of Tibetan text in parallel, which paves the way for Tibetan dependency parsing.

(2) Standardization and construction of a Tibetan dependency treebank. Annotated Tibetan dependency data is very scarce. For this reason, this thesis proposes a conversion algorithm that converts 8252 bracketed Tibetan dependency trees into a CoNLL-format corpus, which reduces the cost of building a dependency treebank by hand. On this basis, a further 2771 Tibetan dependency trees were added through manual annotation, word segmentation, and part-of-speech tagging, giving a treebank of 11023 Tibetan dependency trees in total. The thesis also studies Tibetan dependency structure and refines the Tibetan dependency annotation guidelines, laying the foundation for Tibetan dependency parsing.

(3) Tibetan dependency parsing based on deep learning. Building on existing experience and theory in Tibetan dependency parsing, and taking the characteristics of Tibetan syntax into account, the thesis proposes a Tibetan dependency parsing model that combines a transition-based approach with deep learning. Annotated Tibetan sentences are processed as transition sequences over a stack and a queue, and the transition results are fed into the neural network model. In experiments, the reported accuracy is 94.59% on the test set and 86.44% on the validation set.
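
The stack-and-queue transition idea in (3) can be illustrated with a minimal sketch. The abstract does not say which transition system the thesis uses, so the arc-standard system below is an assumption, and the names `apply_transitions` and `oracle` are hypothetical. The oracle derives the transition sequence from gold heads (as found in the HEAD column of a CoNLL file); in a neural parser, a trained classifier would predict each transition instead.

```python
from typing import List

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT-ARC", "RIGHT-ARC"


def apply_transitions(n_words: int, transitions: List[str]) -> List[int]:
    """Run arc-standard transitions and return the head of each word.

    Words are numbered 1..n_words; head 0 denotes the artificial ROOT.
    """
    stack = [0]                              # starts with ROOT only
    queue = list(range(1, n_words + 1))      # words still to be read
    heads = [-1] * (n_words + 1)
    for t in transitions:
        if t == SHIFT:                       # move next word onto the stack
            stack.append(queue.pop(0))
        elif t == LEFT_ARC:                  # second-from-top depends on top
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        elif t == RIGHT_ARC:                 # top depends on second-from-top
            dep = stack.pop()
            heads[dep] = stack[-1]
        else:
            raise ValueError(f"unknown transition: {t}")
    return heads[1:]


def oracle(gold_heads: List[int]) -> List[str]:
    """Derive an arc-standard transition sequence from gold heads.

    Assumes a projective tree; raises ValueError otherwise.
    """
    n = len(gold_heads)
    heads = [-1] + list(gold_heads)
    pending = [0] * (n + 1)                  # dependents not yet attached
    for h in gold_heads:
        pending[h] += 1
    stack, queue, out = [0], list(range(1, n + 1)), []
    while queue or len(stack) > 1:
        if len(stack) >= 2:
            top, second = stack[-1], stack[-2]
            if second != 0 and heads[second] == top:
                out.append(LEFT_ARC)
                stack.pop(-2)
                pending[top] -= 1
                continue
            # Attach top only after it has collected all of its dependents.
            if heads[top] == second and pending[top] == 0:
                out.append(RIGHT_ARC)
                stack.pop()
                pending[second] -= 1
                continue
        if not queue:
            raise ValueError("non-projective or malformed tree")
        out.append(SHIFT)
        stack.append(queue.pop(0))
    return out


# Three-word sentence whose second word is the root: heads = [2, 0, 2].
gold = [2, 0, 2]
sequence = oracle(gold)
recovered = apply_transitions(len(gold), sequence)
```

Here `sequence` is the list of training targets a classifier would learn to predict from the stack and queue state, and `recovered` matches `gold`, which shows that the transition sequence fully encodes the tree.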
Keywords/Search Tags:Tibetan language dependency syntactic analysis, dependency tree bank, Tibetan word segmentation, Tibetan part-of-speech tagging, dependency tagging standard