
Research On Machine Translation Based On Deep Neural Network

Posted on: 2021-04-24    Degree: Master    Type: Thesis
Country: China    Candidate: Z Y Wei    Full Text: PDF
GTID: 2428330623967964    Subject: Statistics

Abstract/Summary:
In today's world, the rapid development of human society and the global economy has transformed cooperation between countries. Demand for machine translation has grown rapidly, and advances in artificial intelligence have raised new requirements for translation quality. At the same time, progress in machine translation research serves as a benchmark for other areas of natural language processing. Research on machine translation therefore has high practical value and can also advance natural language processing theory.

Machine translation models fall into two categories: statistical machine translation (SMT) and neural machine translation (NMT). An NMT model is built entirely on neural networks using deep learning techniques and consists mainly of an encoder and a decoder. The classic "encoder-decoder" model is based on a recurrent neural network (RNN). However, because the RNN itself is poorly suited to being stacked into deep networks, it is difficult to improve translation performance by adding layers. The currently popular Transformer model abandons the recurrent network entirely while extending the "encoder-decoder" framework. The Transformer can stack many layers, but because it discards the recurrent structure it loses the positional information of the input sequence; to compensate, the model adds position vectors when encoding the text.

Based on these considerations, this thesis carries out the following work:

(1) For the first model, the classic RNN-based translation model, an independently recurrent neural network (IndRNN) is adopted as the network structure. Analysis and derivation show that this network not only preserves the basic sequential characteristics of a recurrent network but also effectively alleviates vanishing and exploding gradients, so model performance can be improved by stacking multiple layers (a sketch of the recurrent update follows this abstract).

(2) Experimental verification of the improved first model. Three control groups were set up, using a vanilla RNN, long short-term memory (LSTM), and GRU as their network structures. Comparative analysis shows that on the experimental data set the independently recurrent network outperforms all three control groups, and its gain from stacking multiple layers is larger than that of the other three models.

(3) For the second model, the Transformer, the work is inspired by the linguistic-cognitive nature of its text encoding and translation process. Part-of-speech information about the words in the text is integrated into the encoding process: it is added to the word vector and the position vector to form the final input vector (a sketch of this input layer also follows the abstract).

(4) Experimental verification of the improved second model. The original Transformer was used as the baseline for comparative analysis. The experimental results show that the Transformer augmented with part-of-speech information translates better than the baseline.
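The thesis itself gives no code, so the following is a minimal PyTorch sketch of the independently recurrent update described in work item (1), assuming the network meant is the IndRNN of Li et al. The class names, the ReLU activation, and the clamp threshold u_max are illustrative assumptions, not the author's implementation.

```python
import torch
import torch.nn as nn

class IndRNNCell(nn.Module):
    """One IndRNN step: h_t = relu(W x_t + u * h_{t-1} + b).

    Unlike a vanilla RNN, the recurrent weight u is a vector, so each
    hidden unit recurs only with itself. Keeping |u| bounded keeps the
    gradient through time bounded, which is what makes deep stacks of
    IndRNN layers trainable.
    """

    def __init__(self, input_size: int, hidden_size: int, u_max: float = 1.0):
        super().__init__()
        self.input_proj = nn.Linear(input_size, hidden_size)  # W x_t + b
        self.u = nn.Parameter(torch.empty(hidden_size).uniform_(-u_max, u_max))
        self.u_max = u_max

    def forward(self, x: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Element-wise ("independent") recurrence instead of a full matrix.
        u = self.u.clamp(-self.u_max, self.u_max)
        return torch.relu(self.input_proj(x) + u * h_prev)


def run_stack(cells, inputs):
    """Unroll stacked cells over inputs of shape (seq_len, batch, input_size)."""
    batch = inputs.size(1)
    states = [inputs.new_zeros(batch, c.u.numel()) for c in cells]
    outputs = []
    for x_t in inputs:  # iterate over time steps
        h = x_t
        for i, cell in enumerate(cells):  # depth: layer i feeds layer i+1
            states[i] = cell(h, states[i])
            h = states[i]
        outputs.append(h)
    return torch.stack(outputs)
```

For example, `run_stack([IndRNNCell(256, 512), IndRNNCell(512, 512)], x)` stacks two layers; because each unit's recurrence is a single scalar, adding layers does not compound the gradient problems that a full recurrent weight matrix would.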
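Work item (3) states that the part-of-speech vector is summed with the word vector and the position vector to form the Transformer's input. Below is a minimal sketch of such an input layer. The abstract does not say whether the part-of-speech vector is learned or fixed; a learned embedding is assumed here, and all dimensions are illustrative.

```python
import math
import torch
import torch.nn as nn

class PosAwareEmbedding(nn.Module):
    """Transformer input sketch: word embedding + sinusoidal position
    encoding + a learned part-of-speech embedding, summed elementwise.
    """

    def __init__(self, vocab_size: int, num_pos_tags: int, d_model: int,
                 max_len: int = 512):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, d_model)
        self.pos_tag_emb = nn.Embedding(num_pos_tags, d_model)
        # Standard sinusoidal position encoding from the Transformer paper.
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, d_model, 2).float()
                        * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div)
        pe[:, 1::2] = torch.cos(position * div)
        self.register_buffer("pe", pe)

    def forward(self, token_ids: torch.Tensor,
                pos_tag_ids: torch.Tensor) -> torch.Tensor:
        # token_ids, pos_tag_ids: (batch, seq_len); output: (batch, seq_len, d_model)
        seq_len = token_ids.size(1)
        return (self.word_emb(token_ids)
                + self.pe[:seq_len].unsqueeze(0)
                + self.pos_tag_emb(pos_tag_ids))
```

Summing rather than concatenating keeps the model dimension unchanged, so the rest of the Transformer (attention layers, feed-forward blocks) needs no modification; only a part-of-speech tagger is added to the preprocessing pipeline.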
Keywords/Search Tags: machine translation, recurrent neural network, part-of-speech information