
Research On Russian-Chinese Military Speech Translation Based On Transformer

Posted on: 2023-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Xing
Full Text: PDF
GTID: 2545307025953289
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
With the acceleration of globalization and informatization, conflicts and exchanges among countries in the political, economic, cultural, and military fields have become increasingly frequent. As an important field of natural language processing, speech translation converts speech signals in one language into text in another, enabling cross-language communication and promoting social development. The technology therefore has practical significance and broad development prospects. Compared with machine translation, speech translation started late, and its technologies are not yet mature. Traditional cascaded speech translation chains a speech recognition model and a machine translation model, which brings disadvantages such as processing delay, model complexity, and error propagation. End-to-end speech translation emerged in response.

With increasingly frequent military cooperation and exchanges between China and Russia, the demand for Russian-Chinese military speech translation has become urgent. However, research in China on end-to-end Russian-Chinese military speech translation is still in its infancy. Translating Russian military speech into Chinese text still faces many difficulties, such as complex sequence conversion and data scarcity, which greatly affect both the difficulty of training the model and its translation quality. On this basis, this thesis analyzes the characteristics of Russian-Chinese military language, independently constructs a Russian-Chinese military speech corpus, improves and optimizes the structure of the end-to-end speech translation model, and realizes Russian-Chinese military speech translation under low-resource conditions. The main research contents are as follows:

(1) A Russian-Chinese military speech corpus is designed and constructed. A total of 101,676 Russian-Chinese parallel sentence pairs were obtained from the national defense news release website, the national defense white paper, and the United Nations corpus website. A 253.5-hour Russian-Chinese military speech corpus was then constructed by combining manual recording and speech synthesis.

(2) Methods for alleviating the lack of data resources in the Russian-Chinese military domain are studied and proposed. By analyzing the characteristics of the Russian language, the thesis constructs lists of Russian military terms, abbreviations, high-frequency words, and keywords, and uses the keyword list to screen the corpus. A cascaded speech translation model is used to verify the validity of the selected data, reaching a BLEU score of 21.49 on the test set. The experimental results show that the dataset formed through Russian language feature analysis has distinct domain characteristics and can be used effectively for end-to-end speech translation in the Russian-Chinese military domain.

(3) The Transformer-based Russian-Chinese military speech translation model is improved and optimized in three ways. The first is improved precoding: a convolutional network based on gated linear units enables the model to learn contextual information, extract more abstract hierarchical features, and mitigate vanishing gradients. The second is data augmentation: time warping, frequency masking, and time masking expand the effective scale of the data without adding new source data, alleviating data scarcity. The third is pre-training, which provides good initial parameters for the model's encoder, accelerating convergence and yielding acoustic feature representations. Experiments show that the performance of the Transformer-based Russian-Chinese military speech translation model can be improved by adopting these three optimization methods.

On the basis of this corpus and model, end-to-end Russian-Chinese military speech translation experiments are carried out. The results show that the Transformer-based model proposed in this study still falls short of the traditional cascaded speech translation model, but it improves overall end-to-end Russian-Chinese military speech translation quality, which has theoretical significance and practical value for research on Russian-Chinese military speech translation under low-resource conditions.
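
The keyword-based corpus screening described in (2) can be illustrated with a minimal sketch. The abstract does not give the actual procedure, so everything here is an assumption for illustration: the function names `tokenize` and `screen_pairs`, the `min_hits` threshold, the sample keywords, and the sample sentence pairs are all hypothetical. Matching is on exact lowercase tokens; because Russian is heavily inflected, a real pipeline would likely need lemmatization or stemming before matching.

```python
import re

def tokenize(text):
    # Lowercase, then split into runs of letters/digits (Unicode-aware,
    # so Cyrillic words are kept whole).
    return re.findall(r"[^\W_]+", text.lower())

def screen_pairs(pairs, keywords, min_hits=1):
    """Keep (russian, chinese) pairs whose Russian side contains
    at least `min_hits` tokens from the keyword list."""
    keyword_set = {k.lower() for k in keywords}
    kept = []
    for ru, zh in pairs:
        hits = sum(1 for tok in tokenize(ru) if tok in keyword_set)
        if hits >= min_hits:
            kept.append((ru, zh))
    return kept

# Hypothetical sample data, not taken from the thesis corpus.
pairs = [
    ("Войска провели учения на полигоне.", "部队在靶场举行了演习。"),
    ("Сегодня хорошая погода.", "今天天气很好。"),
]
keywords = ["войска", "учения", "полигон"]

military_pairs = screen_pairs(pairs, keywords)
```

In this sample only the first pair is kept. Note that the inflected form "полигоне" does not match the keyword "полигон", which is exactly the limitation of exact-token matching mentioned above.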
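
For reference, the gated linear unit mentioned in (3) is commonly written as below. The abstract does not state the exact formulation used in the thesis, so the symbols here (input $X$, convolution kernels $W$ and $V$, biases $b$ and $c$) follow the standard definition and are an assumption about the thesis model.

```latex
h(X) = (X * W + b) \otimes \sigma(X * V + c)
```

Here $*$ denotes convolution, $\sigma$ is the sigmoid function, and $\otimes$ is element-wise multiplication. The left factor is linear in the input, so gradients can flow through it without passing through a saturating nonlinearity, which is the usual explanation for why this gating helps against vanishing gradients.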
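
The frequency-masking and time-masking steps named in (3) can be sketched as follows. This is a minimal pure-Python illustration, not the thesis implementation: the function name `mask_spectrogram`, the mask-width parameters, and the choice of zero as the fill value are assumptions, and time warping is omitted for brevity. The spectrogram is represented as a list of frames, each a list of filterbank values.

```python
import random

def mask_spectrogram(spec, max_freq_width=2, max_time_width=3, rng=None):
    """Return a copy of `spec` with one random frequency band and one
    random time span set to zero. The input is left unchanged."""
    rng = rng or random.Random()
    n_frames = len(spec)
    n_bins = len(spec[0])
    out = [list(frame) for frame in spec]

    # Frequency masking: zero `f` consecutive bins in every frame.
    f = rng.randint(0, min(max_freq_width, n_bins))
    f0 = rng.randint(0, n_bins - f)
    for frame in out:
        for k in range(f0, f0 + f):
            frame[k] = 0.0

    # Time masking: zero `t` consecutive frames across all bins.
    t = rng.randint(0, min(max_time_width, n_frames))
    t0 = rng.randint(0, n_frames - t)
    for i in range(t0, t0 + t):
        out[i] = [0.0] * n_bins

    return out

# Hypothetical 10-frame, 8-bin spectrogram filled with ones.
spec = [[1.0] * 8 for _ in range(10)]
augmented = mask_spectrogram(spec, rng=random.Random(0))
```

Because a fresh mask is drawn on every call, the same utterance yields a different training example each epoch, which is how this kind of augmentation enlarges the effective data scale without new recordings.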
Keywords/Search Tags: Russian-Chinese speech translation, Transformer, End-to-end, Russian-Chinese military speech corpus