Thailand is located in the heart of Southeast Asia and plays an important role in cultural exchange and in economic and trade cooperation between China and Southeast Asia. Personnel exchanges between China and Thailand have become more frequent, and these exchanges rely heavily on language communication. Chinese-Thai bilingual neural machine translation is a low-resource neural machine translation task: because training data is lacking, the quality of Chinese-Thai machine translation is poor. At present, the resource-scarcity problem is generally addressed by fusing syntactic knowledge, and fusing tree structures is one of the more commonly used approaches. However, most existing tree-structure fusion methods require their own encoder-decoder and model structures, which makes them complex, and most of them fuse only the dependency tree structure or only the constituency tree structure. There is no unified tree-structure fusion method by which both tree structures can be integrated into an existing machine translation model, and how to achieve this integration has become a major difficulty and challenge. This paper proposes a Chinese-Thai bilingual neural machine translation method based on a tree-structured attention mechanism, which integrates the information of both tree structures into the machine translation model to improve the quality of Chinese-Thai bilingual machine translation in low-resource settings. The paper completes the following research work: (1) A data acquisition method for Thai constituency syntactic parsing based on deep inside-outside recursive autoencoders is proposed. Acquiring Thai constituency syntax data is the basis of the attention mechanism that fuses tree structures. However, because mature Thai constituency parsing tools are lacking, such data is very difficult to obtain. In
order to obtain high-accuracy Thai constituency parsing data, this paper proposes a Thai constituency parsing data acquisition method based on deep inside-outside recursive autoencoders. A Thai constituency parsing model with high accuracy is trained on a small amount of annotated Thai constituency parsing data and is then used to produce high-accuracy Thai constituency parsing data. (2) A Chinese-Thai neural machine translation method based on a bidirectional dependency self-attention mechanism is proposed. When resources are scarce, translation quality drops significantly, and fusing the dependency tree structure can effectively alleviate the resource-scarcity problem. Most current methods for fusing dependency tree structures traverse the tree in a single direction and therefore cannot obtain more comprehensive syntactic structure information. This paper therefore proposes a dependency syntactic knowledge fusion method based on a bidirectional dependency self-attention mechanism, in which bidirectional dependency knowledge is integrated into the multi-head attention mechanism of the Transformer encoder. With this mechanism, the translation model can attend simultaneously to the dependency information from parent word to child word and from child word to parent word, and thus captures more comprehensive syntactic structure information. Experiments show that fusing the bidirectional dependency self-attention mechanism improves Chinese-Thai bilingual neural machine translation by 0.88 BLEU on average. (3) A Chinese-Thai neural machine translation method based on a constituency structure self-attention mechanism is proposed, and the constituency structure self-attention mechanism and the bidirectional dependency self-attention mechanism are integrated into the Transformer
model using a unified fusion structure. Most current tree-structure attention fusion methods integrate only a single dependency tree structure or constituency tree structure into the translation model, and the current fusion methods for the two tree structures differ greatly, so fusing both structures together runs into incompatibility problems. This paper therefore proposes a Chinese-Thai bilingual neural machine translation method that integrates a constituency structure attention mechanism: a simple source-language sequence-to-matrix conversion is used to construct a constituency structure matrix, and, using a structure unified with the bidirectional dependency self-attention mechanism, the two tree structures are fused into the Transformer model. The constituency structure self-attention mechanism uses the constituency structure matrix to attend to how words combine with one another in the source-language sentence, while the bidirectional dependency self-attention mechanism attends to long-distance dependencies in the sentence, which ultimately improves Chinese-Thai bilingual neural machine translation quality. Experiments show that fusing the constituency structure attention mechanism effectively improves translation quality and that the unified fusion structure also contributes to the improvement. (4) A Chinese-Thai bilingual neural machine translation prototype system. Based on the research in Chapter 4, a Chinese-Thai bilingual neural machine translation prototype system based on the tree-structured attention mechanism is designed and implemented. The translation model is trained with the PyTorch deep learning framework, and the translation results are visualized using Flask and HTML.
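As a rough illustration of the structure matrices mentioned in contributions (2) and (3), the following minimal Python sketch builds two directed dependency matrices (parent to child and child to parent) and a constituency matrix for a toy sentence, then adds one of them to raw attention scores before the softmax. Everything here is an assumption made for illustration and is not taken from the paper: the function names `dependency_matrices`, `constituency_matrix`, and `biased_attention_weights`, the binary encoding of the matrices, the additive `weight` parameter, and the toy head indices and spans are all hypothetical. The paper's actual matrix construction and fusion structure may differ.

```python
import math


def dependency_matrices(heads):
    """Build two directed dependency matrices from head indices.

    heads[i] is the index of the parent of token i, or -1 for the root.
    (Hypothetical encoding, chosen only for this sketch.)
    """
    n = len(heads)
    parent_to_child = [[0.0] * n for _ in range(n)]
    child_to_parent = [[0.0] * n for _ in range(n)]
    for child, parent in enumerate(heads):
        if parent >= 0:
            parent_to_child[parent][child] = 1.0  # row: parent, column: child
            child_to_parent[child][parent] = 1.0  # row: child, column: parent
    return parent_to_child, child_to_parent


def constituency_matrix(n, spans):
    """Mark token pairs that share a constituent.

    spans is a list of (start, end) inclusive token ranges, one per
    constituent. Entry [i][j] is 1.0 if some span contains both i and j.
    """
    matrix = [[0.0] * n for _ in range(n)]
    for start, end in spans:
        for i in range(start, end + 1):
            for j in range(start, end + 1):
                matrix[i][j] = 1.0
    return matrix


def biased_attention_weights(scores, structure, weight=1.0):
    """Add weight * structure to the raw scores, then softmax each row."""
    weights = []
    for score_row, struct_row in zip(scores, structure):
        logits = [s + weight * b for s, b in zip(score_row, struct_row)]
        peak = max(logits)  # subtract the row maximum for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy 4-token sentence: token 1 is the root; tokens 0 and 3 depend on it;
# token 2 depends on token 3. The spans group tokens (0, 1) and (2, 3).
heads = [1, -1, 3, 1]
parent_to_child, child_to_parent = dependency_matrices(heads)
constituency = constituency_matrix(len(heads), [(0, 1), (2, 3)])

# Uniform raw scores, so any difference in the weights comes from the bias.
scores = [[0.0] * len(heads) for _ in range(len(heads))]
weights = biased_attention_weights(scores, child_to_parent)
```

In this sketch each structure matrix could bias a different attention head, which is one simple way to give dependency and constituency information the same interface to the Transformer encoder. Whether the paper uses an additive bias, a multiplicative mask, or another scheme is not stated in the abstract.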