
Research And Application On Speech Recognition For Civil Aviation Air-ground Communication

Posted on: 2021-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: K Zhou
Full Text: PDF
GTID: 2518306479965019
Subject: Master of Engineering
Abstract/Summary:
Air-ground communication is the dialogue between air traffic controllers and pilots. It is an important means of air traffic control in civil aviation and an important guarantee of flight order and safety. It can be used to analyze the causes of accidents and to give warning of potential risks that may lead to safety accidents. In addition, speaker recognition on the voice segments of an accident can effectively and accurately identify the responsible person during accident analysis. However, current general-purpose speech recognition and speaker recognition technology does not perform well on air-ground communication data because of its particular characteristics. The main reasons are as follows: first, the background noise is heavy; second, the speech is fast, so the output of common recognition models drops many words; third, air-ground communication data is difficult to annotate, so little data is available. How to improve existing models and algorithms to fit the air-ground communication task is therefore a problem worth studying. This thesis makes an in-depth study of the above three issues, and its innovative work is as follows:

1. To recognize air-ground communication data, this thesis compares the traditional GMM-HMM (Gaussian Mixture Model-Hidden Markov Model) and DNN-HMM (Deep Neural Network-Hidden Markov Model) with state-of-the-art end-to-end models, and identifies the hybrid CTC-attention (Connectionist Temporal Classification-attention) end-to-end model as the most suitable for air-ground communication data. In the CTC-attention model, integrating the CTC algorithm avoids the time-consuming alignment work that GMM-HMM and DNN-HMM models require, and integrating the attention model improves decoding accuracy and adaptability to fast speech.

2. To address the heavy background noise in air-ground communication data, this thesis adds a CNN (Convolutional Neural Network) structure at the front of the model to extract highly robust features. To address the fast speech rate, this thesis introduces CTC in the decoder to assist decoding, which strengthens the robustness of the model to fast speech. The experimental results show that, compared with the original CTC-attention model, the improved CTC-attention model reduces the character error rate by 28.13% and deletion errors by 32.7%, which confirms the effectiveness of the improved model.

3. To apply speaker verification technology to air-ground communication data, this thesis compares the traditional GMM-UBM model, i-vector, d-vector and x-vector, and finds that the x-vector representation is highly robust, so it is better suited to the noise and fast speech rate of air-ground communication data. The x-vector is therefore used for speaker verification on air-ground communication data in this thesis.

4. Because the air-ground communication data is too small to train the DNN of the x-vector model, this thesis uses transfer learning. However, fine-tuning the DNN directly causes negative transfer, which degrades recognition performance, so this thesis proposes a data-driven transfer learning method. This method reduces the classifier parameters of the x-vector model so that, during transfer, the model is affected only by the data itself and not by the free parameters. Experimental results show that the data-driven transfer learning method effectively improves the performance of the extracted x-vector in a low-resource environment.
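The character error rate and deletion errors reported above are standard edit-distance metrics. A minimal sketch of how they can be computed is given below. It is an illustration only and not the scoring tool used in the thesis: the function names `char_errors` and `cer` and the example transcript are hypothetical. Deletions are reference characters that are missing from the recognizer output, which is the typical symptom of fast speech.

```python
from typing import Tuple


def char_errors(ref: str, hyp: str) -> Tuple[int, int, int, int]:
    """Return (total_edits, substitutions, deletions, insertions).

    Levenshtein alignment of the reference transcript `ref` against the
    recognizer output `hyp`, counted at character level.
    """
    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] holds the best (edits, subs, dels, ins) for ref[:i] vs hyp[:j]
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # every reference character deleted
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # every hypothesis character inserted
    for i in range(1, rows):
        for j in range(1, cols):
            c, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (c, s, d, n)          # match, no cost
            else:
                diagonal = (c + 1, s + 1, d, n)  # substitution
            c, s, d, n = table[i - 1][j]
            deletion = (c + 1, s, d + 1, n)      # reference char missing in hyp
            c, s, d, n = table[i][j - 1]
            insertion = (c + 1, s, d, n + 1)     # extra char in hyp
            table[i][j] = min(diagonal, deletion, insertion)
    return table[-1][-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: total edits divided by reference length."""
    return char_errors(ref, hyp)[0] / len(ref)


# Hypothetical example: the recognizer drops the word "to".
edits, subs, dels, ins = char_errors(
    "climb to flight level three", "climb flight level three"
)
print(edits, subs, dels, ins)
```

When several alignments have the same total cost, this sketch breaks ties arbitrarily (by tuple ordering), so the split between substitutions, deletions and insertions can differ from what a particular toolkit reports even though the total is the same.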
Keywords/Search Tags:Air-ground communication, Speech recognition, Speaker verification, End-to-end model, Transfer learning