Air traffic control (ATC) communication is the main channel for exchanging information between controllers and pilots. Ensuring its accuracy is key to maintaining aviation safety and improving flight efficiency. Radio interference, navigation noise, speaking rate, and controller fatigue cause control instructions to be misstated or misunderstood, which leads to recurring aviation safety events. Converting ATC communication into control instruction text through speech recognition therefore has practical application value. This thesis studies speech recognition of ATC communication based on recurrent neural network models.

First, an end-to-end recurrent neural network model is proposed. Three bidirectional recurrent architectures, the bidirectional recurrent neural network (BiRNN), bidirectional long short-term memory (BiLSTM), and bidirectional gated recurrent unit (BiGRU), are each combined with connectionist temporal classification (CTC) to build an acoustic model. The experimental results show that the BiGRU-CTC model performs better overall: while achieving good recognition results, it has lower training complexity and a shorter training time, which makes it more suitable for the civil aviation ATC speech recognition task.

Second, voice data in the civil aviation field is difficult to obtain, so an acoustic model trained directly on it recognizes poorly and generalizes poorly. To address this, the thesis adopts a transfer learning strategy: the acoustic model is first trained on an open-source data set, and the stored model parameters are then transferred to training on the civil aviation voice data set. At the same time, the training data is expanded through data augmentation to improve the generalization ability and robustness of the acoustic model. The experimental results show that the BiGRU-CTC model that integrates data augmentation and transfer learning performs well on the civil aviation voice data set.

Third, the end-to-end recurrent neural network model is data-driven and does not make good use of linguistic knowledge for text modeling. The thesis therefore integrates the BiGRU-CTC acoustic model with a GRU language model. The language model is trained separately and supplements the whole system semantically, which further improves speech recognition performance.
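
As a rough illustration of how a CTC-based acoustic model such as BiGRU-CTC turns per-frame network outputs into text, the following is a minimal sketch of greedy (best-path) CTC decoding. It is not the thesis's implementation: the function name `ctc_greedy_decode`, the toy vocabulary, the frame scores, and the choice of index 0 for the blank symbol are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Best-path CTC decoding: take the top label per frame,
    merge consecutive repeats, then drop blank symbols."""
    # 1. Most likely label index for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Collapse runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks and map the remaining indices to symbols.
    return "".join(vocab[label] for label in collapsed if label != blank)


# Toy example with made-up scores over the vocabulary [blank, "a", "b"].
vocab = ["<blank>", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates the two "a" labels)
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
decoded = ctc_greedy_decode(frames, vocab)
print(decoded)
```

The blank frame is what allows the same symbol to appear twice in a row in the output. A system that adds a separately trained language model, as described above, would typically replace this greedy step with a search that also scores candidate texts with the language model.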