Study Of Artificial Intelligence Flight Co-Pilot Speech Recognition Technology

Posted on: 2021-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Q He
Full Text: PDF
GTID: 2428330602980529
Subject: Master of Engineering
Abstract/Summary:
Speech recognition technology, as part of human-computer interaction, is essential for machine intelligence. Using a robot as the co-pilot of a civil aircraft is a major breakthrough and a direction for innovation in the civil aviation industry. Applying speech recognition to the robot co-pilot lets the captain's commands pass directly to the co-pilot program, which makes cooperation between the captain and the robot pilot possible. A speech corpus is the basis of speech recognition. At present, researchers commonly use the speech corpus published by Tsinghua University for speech recognition research, but that corpus is not suited to a specialized research direction. Current speech recognition methods fall into two groups: traditional methods and end-to-end methods. Traditional methods are mature and recognize speech well, but their procedures are overly complicated.

Against this background, this thesis establishes a speech database of the standard callouts exchanged between the Pilot Monitoring and the Pilot Flying in the A320 cockpit. Connectionist temporal classification (CTC), one of the end-to-end speech recognition methods, is used to build the speech recognition model.

First, the thesis summarizes and explains the process a robot must go through to act as the co-pilot of an aircraft, the principles and processes of traditional and end-to-end speech recognition, recurrent neural networks, and speech corpora.

Second, a standard speech corpus was built that contains 22 standard callouts divided into six groups. The recording language is Mandarin. The 150 speakers recorded for the corpus all come from Civil Aviation Flight University of China; they have a professional civil aviation background, speak standard Mandarin, and are between 22 and 32 years old. The corpus has a capacity of 1,800 recordings, which improves its generalization. Speech enhancement was then applied to remove the cockpit noise present during flight. Subjective evaluation was used to compare the enhancement achieved by spectral subtraction with that of the minimum mean-square error log-spectral amplitude estimator (MMSE-LSA), and MMSE-LSA was chosen for noise reduction.

Next, a CTC-based speech recognition model using a long short-term memory (LSTM) recurrent neural network was established. This model effectively suppresses the vanishing-gradient and exploding-gradient problems that a simple recurrent neural network shows during training. The model is basically feasible for training and testing on the standard speech corpus, but its error rate is relatively high: 31% in training and 45% in testing. To reduce the error rate, two further models were built at the end of the thesis: a CTC-based Bi-LSTM recurrent neural network model and a CTC-based Bi-GRU recurrent neural network model. Both reduce the error rate, the Bi-LSTM model most of all: its training error rate falls to 1.2% and its test error rate to 3.2%. The Bi-LSTM model is therefore used as the speech recognition system of the artificial intelligence co-pilot.
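As an illustration of how a CTC-based recognizer turns per-frame network outputs into a label sequence, and of how an error rate like the ones reported above can be computed, here is a minimal sketch in plain Python. It is not the thesis's implementation: the abstract does not say which decoding strategy or which error metric was used, so best-path (greedy) decoding, a blank label at index 0, and an edit-distance error rate are all assumptions, and the names `ctc_greedy_decode`, `edit_distance`, and `label_error_rate` are hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path decoding: take the top label per frame,
    merge consecutive repeats, then drop blanks."""
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    decoded: List[int] = []
    previous = None
    for label in best_path:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank; a blank between two equal labels keeps both.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        diagonal, row[0] = row[0], i
        for j, h in enumerate(hyp, start=1):
            diagonal, row[j] = row[j], min(
                row[j] + 1,            # deletion
                row[j - 1] + 1,        # insertion
                diagonal + (r != h),   # substitution or match
            )
    return row[-1]


def label_error_rate(refs: Sequence[Sequence], hyps: Sequence[Sequence]) -> float:
    """Total edit distance divided by total reference length."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_labels = sum(len(r) for r in refs)
    return total_errors / total_labels
```

In this sketch, each row of `frame_scores` would be the network's output for one audio frame (for example, the softmax output of a Bi-LSTM layer), and the labels would be whatever units the model predicts, such as characters or syllables.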
Keywords/Search Tags: Speech corpus, Speech enhancement, Speech recognition, CTC, LSTM, Bi-LSTM, Bi-GRU