
Research on the Key Techniques for Neural Machine Translation

Posted on: 2021-04-05    Degree: Master    Type: Thesis
Country: China    Candidate: Y N Chen    Full Text: PDF
GTID: 2518306017472904    Subject: Intelligent Science and Technology
Abstract/Summary:
Machine Translation (MT) uses computer algorithms to convert text automatically between natural languages, and it has high theoretical and practical research value. In recent years, with the development of deep learning, Neural Machine Translation (NMT) has made breakthrough progress and now clearly outperforms traditional Statistical Machine Translation (SMT) systems. Although NMT has greatly improved translation quality in most respects, the translation of long sentences remains unsatisfactory, owing to the complex structure of long sentences and the limited memory capacity of current translation models. In practical tasks such as image translation and speech translation, missing punctuation marks further degrade the translation of long sentences. This thesis focuses on these two problems, aiming to improve the translation quality of long sentences by increasing the memory capacity of the model and by recovering the punctuation marks that are missing in speech translation. The main work and innovations of the thesis are as follows:

1. To improve the memory capacity of the translation model, a Neural Machine Translation model based on the Recurrent Expert Unit (REU) is proposed. The REU enhances the parameter capacity of Recurrent Neural Networks (RNNs) with multiple expert units and uses a context-aware gating function to balance the information flow from the different experts; a top-k gating function is further introduced to reduce computational complexity (an illustrative sketch of this gating mechanism follows the abstract). Machine translation experiments on the WMT17 Chinese-English and WMT14 English-German datasets show that the proposed method significantly improves NMT translation performance, especially on long sentences.

2. To address missing punctuation marks, a punctuation recovery method based on Bidirectional Encoder Representations from Transformers (BERT) and Focal Loss is proposed. The method uses BERT to extract strong semantic features and trains with Focal Loss to alleviate the class imbalance caused by there being far more unpunctuated positions than punctuated ones (a sketch of this loss also follows the abstract). The effectiveness of the punctuation recovery model is verified on speech translation tasks: experiments on a self-built Chinese-English dataset and the IWSLT15 English-German dataset show that the method significantly improves translation quality, especially for long sentences.
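The following is a minimal sketch, not the thesis implementation, of the general mechanism the REU description points to: several expert transforms whose outputs are mixed by a gate conditioned on the current input and the previous hidden state, with only the top-k experts evaluated at each step. All class and variable names (e.g. TopKExpertGate) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKExpertGate(nn.Module):
    """Context-aware top-k mixture of expert transforms over an RNN hidden state."""

    def __init__(self, hidden_size, num_experts=4, k=2):
        super().__init__()
        self.k = k
        # Each expert is a small feed-forward transform; the experts in the
        # thesis's REU may be structured differently.
        self.experts = nn.ModuleList(
            [nn.Linear(2 * hidden_size, hidden_size) for _ in range(num_experts)]
        )
        # The gate is "context-aware": it scores experts from the current
        # input together with the previous hidden state.
        self.gate = nn.Linear(2 * hidden_size, num_experts)

    def forward(self, x, h_prev):
        context = torch.cat([x, h_prev], dim=-1)           # (batch, 2*hidden)
        scores = self.gate(context)                        # (batch, num_experts)
        # Keep only the k highest-scoring experts and renormalize their
        # weights, which reduces the per-step computation.
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)           # (batch, k)
        h_new = torch.zeros_like(h_prev)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e
                if mask.any():
                    h_new[mask] = h_new[mask] + weights[mask, slot].unsqueeze(-1) \
                        * torch.tanh(expert(context[mask]))
        return h_new


# Example: one recurrent step over a batch of 8 with hidden size 16.
layer = TopKExpertGate(hidden_size=16, num_experts=4, k=2)
x = torch.randn(8, 16)
h = torch.zeros(8, 16)
h = layer(x, h)
```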
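Similarly, the following is a minimal sketch of Focal Loss for multi-class punctuation tagging, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The label scheme and the alpha/gamma settings used in the thesis are not given here, so the values below are illustrative defaults.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Multi-class focal loss; logits: (N, C), targets: (N,) integer labels."""
    log_probs = F.log_softmax(logits, dim=-1)
    # log p_t: log-probability of the true class at each token position.
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    # (1 - p_t)^gamma down-weights easy, well-classified positions, so the
    # abundant "no punctuation" class dominates the loss less and the rarer
    # punctuation classes contribute more.
    return (-alpha * (1.0 - pt) ** gamma * log_pt).mean()


# Example: 5 token positions, 4 labels (none, comma, period, question mark).
logits = torch.randn(5, 4)
targets = torch.tensor([0, 0, 1, 0, 2])
loss = focal_loss(logits, targets)
```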
Keywords/Search Tags:Neural Machine Translation, Speech Translation, Memory Capacity, Punctuation Recovery