Air Traffic Control Speech Recognition Based On Deep Learning

Posted on:2020-05-18

Degree:Master

Type:Thesis

Country:China

Candidate:S F Zhang

Full Text:PDF

GTID:2428330572982437

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

With the rise of deep learning, deep speech recognition technology has gradually replaced the traditional GMM-HMM speech recognition model and has become the mainstream approach in the field. Air Traffic Control (ATC) speech is the main form of communication between air traffic controllers and pilots, and ATC speech recognition plays an important role in traffic control systems and in the real-time monitoring of air-ground communication. We have carried out a series of studies on methods for ATC speech recognition. First, we design and implement an ATC speech recognition system for the case where only a small amount of annotated data is available. Because the samples are insufficient, we use syllables as the modeling unit and construct the acoustic model from a BLSTM (Bidirectional Long Short-Term Memory) network trained with CTC (Connectionist Temporal Classification). We then use a Transformer as the language model to convert syllable sequences into words. Experiments show that the system achieves acceptable recognition performance. Second, for ATC speech data with a large number of annotations, we design and implement two acoustic models: FC-N-BLSTM+CTC (where FC denotes a fully connected layer and N is the number of FC layers) and DFCNN+CTC (Deep Fully Convolutional Neural Network). FC-3-BLSTM+CTC achieved a character error rate of 9.6%, but the training and decoding times of the BLSTM method are relatively long. The character error rate of DFCNN+CTC was 0.5% higher than that of BLSTM+CTC, but its training and decoding times are shorter, which avoids the time cost of training and decoding with BLSTM. Finally, we complete the bad-label selection task on X-ATC using FC-3-BLSTM+CTC and DFCNN+CTC, in order to avoid the time-consuming manual selection of bad samples. We then retrain the acoustic models on the corrected data and obtain a better result.
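The abstract reports results as character error rates for CTC-trained acoustic models. As a rough illustration only, and not the thesis's own code, the sketch below shows how a per-frame CTC output is usually collapsed into a label sequence (merge repeats, then drop blanks) and how a character error rate can be computed as edit distance divided by reference length. The function names, the blank index of 0, and the sample inputs are all assumptions made for this example.

```python
def ctc_greedy_collapse(frame_ids, blank=0):
    """Collapse per-frame label ids: merge consecutive repeats, then drop blanks.

    `blank=0` is an assumed convention; the thesis does not state its blank index.
    """
    collapsed = []
    previous = None
    for label in frame_ids:
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists)."""
    # previous_row[j] holds the distance between reference[:i-1] and hypothesis[:j].
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous_row[j - 1] + (ref_item != hyp_item)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row
    return previous_row[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of reference characters."""
    if len(reference) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical frame-level argmax output with blank = 0.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
    # One substitution against a four-character reference.
    print(character_error_rate("abcd", "abxd"))
```

Under this definition, a 9.6% character error rate corresponds to roughly one character edit per ten reference characters, averaged over the test set.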

Keywords/Search Tags:

ATC speech recognition, Deep Learning, Acoustic model, Language model

Related items

1	Research On Uyghur Speech Recognition Based On Deep Learning
2	Research On Adaptation Methods In Deep Learning Based Speech Recognition Systems
3	Air Traffic Control Speech Recognition Based On Deep Learning
4	Development Of Offline Speech Recognition System Based On Deep Learning
5	Research On Acoustic Model Of Speech Recognition In Educational Scene Based On Deep Learning
6	Research On Embedded Speech Recognition System Based On Deep Learning
7	Design And Implementation Of Intelligent Speech Interaction
8	Research And Implementation Of Mongolian-Chinese Mixed Language Speech Recognition System Based On Deep Learning
9	A Study On The Extraction Of Speech Depth In Tibetan Language And Its Speech Recognition
10	Research On Amdo Tibetan Speech Recognition Technology Based On Deep Learning