
The Technology Of Offline Speech Translation

Posted on: 2020-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zhan
Full Text: PDF
GTID: 2428330605950766
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, artificial intelligence has become a research hotspot. As the global economy integrates, exchanges between countries have become more frequent, yet language differences remain a serious obstacle to globalization, international travel, and communication between speakers of different languages. As an important field of artificial intelligence, speech translation can break the language barrier and make communication between people of different countries easier. Speech translation covers three major technologies: speech recognition, machine translation and speech synthesis. This thesis takes speech recognition and machine translation as its research objects in order to realize offline speech translation.

Traditional speech recognition uses a Gaussian Mixture Model combined with a Hidden Markov Model (GMM-HMM) for acoustic modeling, an approach that is mature and stable. However, the GMM is a shallow model, and its ability to model large-scale corpora is difficult to improve. With the rise of deep learning, Deep Neural Networks (DNN), by virtue of their structure, offer more powerful learning and modeling capabilities for complex data. The thesis studies the GMM-HMM and DNN-HMM acoustic models in depth and builds an offline speech recognition system. It also studies phrase-based statistical machine translation and combines it with the speech recognition system to realize a Chinese-English offline speech translation system. The main work is as follows:

(1) Each module of speech recognition is studied independently. The main research objects are speech signal preprocessing, acoustic feature extraction, the acoustic model, the language model and decoding.

(2) The robustness of speech recognition is studied, with a focus on speech anti-noise technology. Because the multi-resolution property of the wavelet transform makes it well suited to non-stationary speech signals, a log-based wavelet threshold denoising algorithm for speech enhancement is proposed. The effectiveness of the improved algorithm is verified through denoising experiments and comparison with other algorithms.

(3) The structure and training methods of the GMM and the DNN are analyzed, and acoustic modeling is studied. First, GMM-HMM is used as the baseline speech recognition system, with the acoustic units modeled as monophones and triphones. Then the DNN-HMM acoustic model is built, and the recognition performance of the two models is compared experimentally. The results show that the DNN model outperforms the GMM model: the phoneme error rate and word error rate are reduced by 5.66% and 3.48%, respectively. Finally, an offline speech recognition system is built and its recognition performance is tested.

(4) Mel frequency cepstral coefficient (MFCC) features and Mel filter bank (Fbank) features are each used as input to train a DNN model. The corresponding DNN-HMM acoustic models are built, and the effects of the two feature types on the recognition results are compared. The experiments show that the Fbank feature is better suited to training the DNN model. The number of filters in the Fbank feature is also varied to study its influence on the recognition results.

(5) Finally, phrase-based statistical machine translation is studied. The models required for translation are obtained by training on a text data set, and the resulting translation module is combined with the offline speech recognition system.
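The word error rate reported in item (3) is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The abstract does not say how the thesis computed its scores, so the following is only a minimal sketch of that conventional definition. The function names `edit_distance` and `word_error_rate` are illustrative and do not come from the thesis.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences.

    Counts the minimum number of substitutions, deletions and
    insertions needed to turn `ref` into `hyp`.
    """
    # prev[j] holds the distance between the first i-1 tokens of ref
    # and the first j tokens of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper: tokenizes on whitespace, which is an assumption;
    Chinese text would need a word segmenter or character-level scoring.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER = {word_error_rate(ref, hyp):.3f}")
```

The same `edit_distance` function applied to phoneme sequences instead of word sequences gives a phoneme error rate, the other metric compared in item (3).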
Keywords/Search Tags: Offline, Speech recognition, Machine translation, Anti-noise technology, DNN