
Simulation Study On Speech Command Recognition Under Vehicular Noise

Posted on: 2019-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Tu
GTID: 2428330566986084
Subject: Signal and Information Processing
Abstract/Summary:
Speech commands let drivers control a car in a more natural way and reduce the distraction of operating in-vehicle electronic devices, which improves driving safety. Because speech command recognition is always on and must keep working off-line when the network connection is poor or absent, it is important to study recognizers that save memory and compute resources while remaining accurate and robust to noise. To this end, the following work is done in this thesis.

First, speech command recognition under noise-free conditions is explored. Since CNNs are good at local processing and RNNs are good at sequential processing, a new neural network named CGRU (convolutional gated recurrent unit), which combines convolutional and recurrent operations, is proposed. A speech command recognition model is then built on CGRU. Experimental results show that the proposed CGRU model achieves the highest accuracy, 96.65%, outperforming the other six models, including the second-best ResCNN model at 96.53%. Moreover, the proposed CGRU model needs only about 1/25 of the multiplications of the ResCNN model.

Second, monaural speech de-noising based on deep learning is explored. Because the fbank feature is part of the input to the proposed speech command recognition model and its dimension is much lower than that of the FFT spectrum, we propose to de-noise the fbank feature directly. Since vehicular noise varies more slowly over time than the speech signal, noise information can be extracted more effectively by processing the neighborhood of the current frame with a CNN, which is good at local processing. A new de-noising model based on a CNN and an RNN is then proposed. Experimental results show that, compared with the traditional RNN de-noising model, the proposed CNN-RNN model reduces the average MSE by a relative 24%, the number of parameters by 62%, and the number of multiplications by 55%.

Finally, speech command recognition under vehicular noise is explored. Because noise cannot be completely removed from a monaural noisy signal, a mismatch remains between test data and training data once the proposed de-noising model is combined with the recognition model; retraining the recognition model can reduce this mismatch. Two ways of retraining are explored: one starts from random initialization, and the other starts from the current values of the model parameters. Experimental results show that retraining in the second way improves the noise robustness of the recognition model and achieves the best performance. An accuracy of 94.94% is achieved even under vehicular noise at an SNR of -15 dB. Moreover, the average accuracy over several different SNRs is 96.40%, only 0.25 percentage points below the accuracy under noise-free conditions.
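The noisy conditions mentioned in the abstract are specified by SNR. As a rough illustration of what a given SNR means, the sketch below scales a noise signal so that the speech-to-noise power ratio matches a target value in dB and then adds it to the speech. This is the generic textbook procedure, not necessarily how the thesis prepared its data; the function names `mean_power`, `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the Gaussian "noise", and the SNR value used are all assumptions made for the example.

```python
import math
import random


def mean_power(signal):
    """Average power of a signal given as a list of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) == snr_db, then add it.

    Returns (noisy_signal, scaled_noise). Hypothetical helper for illustration.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0.0 or p_noise == 0.0:
        raise ValueError("both signals must have non-zero power")
    # Noise power required to reach the requested SNR.
    target_noise_power = p_speech / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / p_noise)
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


# Placeholder signals (assumptions): a 220 Hz tone stands in for speech,
# white Gaussian noise stands in for vehicular noise, both 1 s at 16 kHz.
random.seed(0)
sample_rate = 16000
speech = [0.1 * math.sin(2.0 * math.pi * 220.0 * i / sample_rate)
          for i in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate)]

# The SNR value here is only an illustrative choice.
noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=-15.0)
print(measured_snr_db(speech, scaled_noise))
```

At a negative SNR the added noise carries more power than the speech itself, which is why accuracy under such conditions is a demanding test of noise robustness.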
Keywords/Search Tags: speech command recognition, vehicular noise, deep learning, convolutional gated recurrent unit, monaural speech de-noising