Research And Application Of Machine Translation Technology On Recurrent Neural Network

Posted on: 2019-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: M Wang
Full Text: PDF
GTID: 2348330563953934
Subject: Computer software and theory
Abstract/Summary:
Machine translation is the process by which a computer converts sentences in a source language into sentences in a target language. Because it lowers the barriers to communication between languages, machine translation is widely used and in high demand, and it is an important application area within natural language processing. A recurrent neural network contains both a feedforward path and a feedback path. The feedforward path resembles a traditional feedforward neural network, while the feedback path sends the outputs of some neurons back to those same neurons as inputs at the next time step. This structure lets the network capture time-series information more effectively, which addresses the difficulty earlier machine translation techniques had in capturing context and improves on traditional machine translation. However, machine translation based on recurrent neural networks still has many defects, including poor translation of long sentences, poor readability, and translation omission, where parts of the source are left untranslated.

This thesis designs a data processing method, based on the characteristics of the actual application data, that transforms the raw data into input for the machine translation model. It also designs an enhanced machine translation model that targets the defects of machine translation based on recurrent neural networks. The main work is as follows:

(1) This thesis analyzes the forms and defects of the application data and proposes a data processing method based on a language model and sentence similarity. The method covers processing, cleaning, and screening of the raw application data, and it produces a higher-quality training data set for the machine translation model.

(2) This thesis analyzes the long-sentence, readability, and translation-omission defects of machine translation based on recurrent neural networks and proposes a machine translation model based on the chunk principle to improve translation quality. It also proposes a Beam Search algorithm that is based on the language model and combined with a length penalty strategy, which helps the machine translation model generate its final translations at test time.

(3) Comparative experiments are run separately on the data processing methods and on the machine translation techniques, in order to evaluate the data processing method based on the language model and sentence similarity and the machine translation model based on the chunk principle. The experimental results show that the data processing method effectively safeguards data quality and indirectly improves the performance of the machine translation model. They also show that the chunk-based machine translation model improves on traditional machine translation based on recurrent neural networks, with better translation quality than the Encoder-Decoder model, among others.
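The Beam Search idea in item (2) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything specific here is an assumption: the translation model and language model are hypothetical toy functions (`toy_tm`, `toy_lm`), the language model is mixed in as a weighted log-probability (`lm_weight`), and the length penalty divides the final score by the hypothesis length raised to `alpha`. It is not the thesis's implementation.

```python
import math
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]
# Assumed interface: maps a partial translation to next-token log-probabilities.
StepFn = Callable[[Prefix], Dict[str, float]]

EOS = "</s>"


def beam_search(tm_step: StepFn, lm_step: StepFn, beam_size: int = 3,
                max_len: int = 10, lm_weight: float = 0.3,
                alpha: float = 0.6) -> Tuple[List[str], float]:
    """Beam search with language-model rescoring and a length penalty.

    Assumes tm_step always proposes at least one token.
    """
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    finished: List[Tuple[Prefix, float]] = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            lm = lm_step(prefix)
            for tok, tm_lp in tm_step(prefix).items():
                # Tokens unknown to the language model get a small floor score.
                lm_lp = lm.get(tok, math.log(1e-6))
                candidates.append(
                    (prefix + (tok,), score + tm_lp + lm_weight * lm_lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            # Hypotheses that emitted EOS leave the beam and wait for ranking.
            (finished if prefix[-1] == EOS else beams).append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was reached

    def normalized(item: Tuple[Prefix, float]) -> float:
        prefix, score = item
        # Assumed length penalty: divide by length ** alpha (alpha=0 disables).
        return score / (max(len(prefix), 1) ** alpha)

    best = max(finished, key=normalized)
    return list(best[0]), normalized(best)


def toy_tm(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical translation model over a tiny vocabulary."""
    if len(prefix) == 0:
        return {"a": math.log(0.6), "b": math.log(0.4)}
    if len(prefix) >= 2:
        return {EOS: 0.0}
    return {"c": math.log(0.5), EOS: math.log(0.5)}


def toy_lm(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical uniform language model over the same vocabulary."""
    return {tok: math.log(0.25) for tok in ("a", "b", "c", EOS)}


if __name__ == "__main__":
    print(beam_search(toy_tm, toy_lm, alpha=0.6))
    print(beam_search(toy_tm, toy_lm, alpha=0.0))
```

With these toy distributions, `alpha=0.6` selects the three-token hypothesis `a c </s>`, while `alpha=0.0` selects the shorter `a </s>`. This shows how a length penalty can counteract the bias of summed log-probabilities toward short outputs, which is one plausible way to reduce omitted content.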
Keywords/Search Tags: machine translation, chunk, language model, sentence similarity, length penalty strategy