
Research and Application of a Neural Machine Translation Model Based on the Attention Mechanism

Posted on: 2022-07-05  Degree: Master  Type: Thesis
Country: China  Candidate: J G Zhang  Full Text: PDF
GTID: 2518306524489354  Subject: Master of Engineering
Abstract/Summary:
With the continuous advancement of artificial intelligence technology, existing machine models have largely achieved perceptual intelligence and are moving toward cognitive intelligence. Natural language processing is the foundation of cognitive intelligence and a research hotspot in both academia and industry. To meet society's need for communication across languages, as exchanges between countries grow ever more frequent, low-cost machine translation research is flourishing. As deep learning techniques have matured, machine translation has absorbed these methods and strategies and achieved good results on many tasks. Nevertheless, several shortcomings remain.

First, most translation models rely on the attention mechanism to align words between the two languages. However, softmax-based normalization always assigns some attention mass to irrelevant words, so obtaining a more precise attention distribution is important. Second, most neural translation models follow the encoder-decoder structure and translate autoregressively: each word is generated conditioned on the words already produced, which makes decoding inefficient and denies the model a global view of the translation. Finally, word vectors are the basis on which the model acquires semantic and grammatical information, so obtaining word vectors that carry more comprehensive semantic and grammatical information is essential.

Based on the problems above, this thesis conducts the following research:

1. Aiming at precise attention alignment in translation, this thesis replaces the commonly used softmax normalization with a sparse normalization method (sparsemax; see the first sketch below) and evaluates it on a Transformer-based neural machine translation system. The experimental results show that by concentrating weight sparsely on the most relevant words, the method reduces the unnecessary weight assigned to irrelevant words, alleviates inductive bias in the data, and improves both the accuracy and the interpretability of the translation system.

2. Aiming at the problem that the Transformer's decoding time in the inference stage grows quadratically with translation length, this thesis adopts a cumulative average attention layer to alleviate it (second sketch below). In addition, because an autoregressive model can use only the preceding words when generating a sequence, this thesis integrates the idea of deliberation ("scrutinizing") networks and obtains global information about the generated sentence through two decoding passes (third sketch below). The experimental results show that sentences produced after two decoding passes are more coherent and more complete in meaning.

3. Given that most current models adopt word-level embedding representations, a multi-representation fusion word vector is proposed, in which a character-level encoding vector is directly concatenated with the word-level encoding vector (final sketch below). The fused word vector effectively handles out-of-vocabulary and low-frequency words, expresses more complete word-meaning information, and directly benefits the performance of the entire translation model. The experimental results show that the proposed fusion methods and strategies effectively improve the translation quality of the overall model.
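The first contribution swaps softmax for sparsemax (Martins & Astudillo, 2016) when normalizing attention scores. Below is a minimal PyTorch sketch of sparsemax for illustration only; it is not the thesis's actual code, and all names are our own. Unlike softmax, sparsemax projects the scores onto the probability simplex and can assign exactly zero weight to irrelevant words.

```python
# Minimal sparsemax sketch (Martins & Astudillo, 2016); illustrative, not the thesis's code.
import torch

def sparsemax(scores: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Euclidean projection of scores onto the simplex; output can contain exact zeros."""
    z, _ = torch.sort(scores, dim=dim, descending=True)
    cumsum = z.cumsum(dim=dim)
    k = torch.arange(1, scores.size(dim) + 1, device=scores.device, dtype=scores.dtype)
    shape = [1] * scores.dim()
    shape[dim] = -1
    k = k.view(shape)                              # reshape k for broadcasting along `dim`
    support = (1 + k * z) > cumsum                 # positions kept in the sparse support
    k_max = support.sum(dim=dim, keepdim=True).to(scores.dtype)
    support_sum = torch.where(support, z, torch.zeros_like(z)).sum(dim=dim, keepdim=True)
    tau = (support_sum - 1) / k_max                # threshold subtracted from every score
    return torch.clamp(scores - tau, min=0)

scores = torch.tensor([2.0, 1.5, 0.1, -1.0])
print(torch.softmax(scores, dim=-1))   # every word receives some attention mass
print(sparsemax(scores, dim=-1))       # tensor([0.7500, 0.2500, 0.0000, 0.0000])
```

On this toy input, sparsemax zeroes out the two low-scoring positions entirely, which is exactly the "no attention leaked to irrelevant words" behavior the abstract describes.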
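The cumulative average attention layer of the second contribution follows the average attention network of Zhang et al. (2018). The sketch below is a simplified version under stated assumptions: it replaces decoder self-attention with a cumulative mean of the inputs and gates it against the current input, omitting the paper's feed-forward sublayer; layer sizes are illustrative.

```python
# Simplified cumulative average attention (after Zhang et al., 2018); illustrative sketch.
import torch

class AverageAttention(torch.nn.Module):
    """g_t = (1/t) * sum_{k<=t} y_k, combined with y_t through learned gates.
    At inference a running sum makes each step O(1) instead of O(t)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = torch.nn.Linear(2 * d_model, 2 * d_model)

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, seq_len, d_model); cumulative mean over positions 1..t
        t = torch.arange(1, y.size(1) + 1, device=y.device, dtype=y.dtype).view(1, -1, 1)
        g = y.cumsum(dim=1) / t
        # gated combination of the current input and its averaged left context
        i_gate, f_gate = self.gate(torch.cat([y, g], dim=-1)).chunk(2, dim=-1)
        return torch.sigmoid(i_gate) * y + torch.sigmoid(f_gate) * g

layer = AverageAttention(d_model=8)
out = layer(torch.randn(2, 5, 8))
print(out.shape)  # torch.Size([2, 5, 8])
```

Because the averaged context at step t depends only on a running sum, the per-step inference cost is constant, in contrast to self-attention, whose per-step cost grows with the prefix length.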
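The two-pass decoding idea can be illustrated with a toy sketch. The construction below is our own simplification, not the thesis's model: a second decoder attends over the concatenation of the encoder states and the first-pass draft states, so the refinement pass sees global information about the whole output. Real deliberation networks use separate attention modules over source and draft, and the first pass would decode autoregressively rather than consume stand-in embeddings.

```python
# Toy two-pass ("deliberation"-style) decoding sketch; all tensors are stand-ins.
import torch

d = 16
encoder = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d, 2, batch_first=True), 1)
first_pass = torch.nn.TransformerDecoder(
    torch.nn.TransformerDecoderLayer(d, 2, batch_first=True), 1)
second_pass = torch.nn.TransformerDecoder(
    torch.nn.TransformerDecoderLayer(d, 2, batch_first=True), 1)

src = torch.randn(1, 7, d)        # toy source sentence embeddings
memory = encoder(src)
draft_in = torch.randn(1, 9, d)   # stand-in for first-pass target embeddings
draft = first_pass(draft_in, memory)           # pass 1: draft translation states
# pass 2 attends over source states AND the whole draft, gaining a global view
memory2 = torch.cat([memory, draft], dim=1)
refined = second_pass(draft_in, memory2)
print(refined.shape)              # torch.Size([1, 9, 16])
```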
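Finally, the multi-representation fusion word vector directly concatenates a character-level encoding with the word-level embedding. In the sketch below, characters are composed with a small GRU; the composer choice and all layer sizes are illustrative assumptions, since the abstract specifies only the direct concatenation.

```python
# Character + word embedding concatenation; sizes and GRU composer are assumptions.
import torch

class CharWordEmbedding(torch.nn.Module):
    def __init__(self, word_vocab: int, char_vocab: int, d_word: int = 12, d_char: int = 4):
        super().__init__()
        self.word_emb = torch.nn.Embedding(word_vocab, d_word)
        self.char_emb = torch.nn.Embedding(char_vocab, d_char)
        self.char_rnn = torch.nn.GRU(d_char, d_char, batch_first=True)

    def forward(self, word_ids: torch.Tensor, char_ids: torch.Tensor) -> torch.Tensor:
        # word_ids: (batch, seq); char_ids: (batch, seq, max_chars)
        b, s, c = char_ids.shape
        chars = self.char_emb(char_ids).view(b * s, c, -1)
        _, h = self.char_rnn(chars)              # final hidden state summarizes each word
        char_vec = h[-1].view(b, s, -1)
        # direct concatenation: rare or unseen words still carry a char-level signal
        return torch.cat([self.word_emb(word_ids), char_vec], dim=-1)

emb = CharWordEmbedding(word_vocab=100, char_vocab=30)
out = emb(torch.randint(0, 100, (2, 5)), torch.randint(0, 30, (2, 5, 7)))
print(out.shape)  # torch.Size([2, 5, 16])
```

Because the character path never depends on the word vocabulary, an out-of-vocabulary word mapped to an unknown-word embedding still receives a meaningful character-level component, which is how the fused vector helps with rare and unseen words.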
Keywords/Search Tags: Neural Machine Translation, Attention Mechanism, Deep Learning, Natural Language Processing, Sparsemax