
Research On Supervised Speech Enhancement Based On Deep Neural Networks

Posted on: 2020-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: S R Bai
GTID: 2428330572471555
Subject: Systems analysis and integration
Abstract/Summary:
With the development of artificial intelligence, speech has become an important means of human-computer interaction. Real-world noise, which is complex and changeable, seriously degrades the accuracy of speech recognition systems and reduces user experience. The main purpose of speech enhancement is to suppress the noise in noisy speech so as to improve the perceptual quality and intelligibility of the speech, and thereby make human-computer interaction more efficient. Traditional speech enhancement methods mostly rely on noise estimation to reduce noise, which is simple and easy to implement. However, they have difficulty dealing effectively with non-stationary noise, and so have limitations in practical applications. Supervised speech enhancement methods based on deep learning can automatically learn the nonlinear mapping between noisy speech features and the desired targets. Compared with the traditional methods, the supervised methods can, to a certain extent, improve speech quality in low signal-to-noise ratio and non-stationary noise environments. This thesis focuses on supervised speech enhancement methods based on deep neural networks, and the main work is as follows:

(i) A speech enhancement method based on the gated recurrent unit (GRU) is proposed. A deep neural network (DNN) has nonlinear expressive ability, but it cannot learn the context information of a speech signal well. In this thesis, the gates of the GRU are used to learn the long- and short-term information of the speech signal, which effectively remedies this shortcoming of the DNN model. More importantly, the proposed GRU model has fewer trainable parameters than the long short-term memory (LSTM) model, and can improve the training speed of the model while preserving the memory ability of the neurons. A series of experiments is carried out to validate the GRU-based speech enhancement method on a matched noise test set, an unmatched noise test set and an unmatched signal-to-noise ratio test set. The experimental results show that the GRU model trains faster than the LSTM model and that the two models achieve comparable performance: the speech enhanced by the GRU model and by the LSTM model has comparable perceptual quality and objective intelligibility. Finally, the effectiveness of the GRU-based speech enhancement method is further verified on a real noise test set.

(ii) By combining a convolutional neural network (CNN) with a gated recurrent unit neural network, a CNN-GRU based speech enhancement method is proposed. A CNN has powerful feature learning ability and can uncover structural information hidden in the data. A GRU is a gated recurrent neural network that can learn long-term information. In this thesis, a convolutional layer is used to learn local features of the speech signal, and a gated recurrent layer is then connected after the convolutional layer to learn the correlation between local features in different time periods. Finally, fully connected layers are used to learn the nonlinear mapping between the speech features and the ideal targets. In this way, the time-frequency correlation of the speech signal can be fully exploited. The CNN-GRU based speech enhancement model is verified on a matched noise test set, an unmatched noise test set, an unmatched signal-to-noise ratio test set and a real noise test set, and is compared with the CNN model and the GRU model. The experimental results show that the CNN-GRU model has better generalization ability, and that the speech it enhances has better quality than that of the CNN model and the GRU model.

At the end of this thesis, we summarize the research work and outline some plans for future work.
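
The statement in (i) that a GRU has fewer trainable parameters than an LSTM can be checked with a short count. The sketch below is only an illustration, not the thesis's implementation: it assumes the textbook formulation in which each gate block holds one input weight matrix, one recurrent weight matrix and one bias vector, and the layer sizes (257 input features, 512 hidden units) are hypothetical values chosen for the example, since the abstract does not give the actual ones. Some frameworks store two bias vectors per block, which changes the totals slightly but not the 3:4 ratio.

```python
def recurrent_layer_params(input_size: int, hidden_size: int, num_blocks: int) -> int:
    """Count the weights and biases of a single recurrent layer.

    Assumption: every gate block has an input weight matrix
    (hidden_size x input_size), a recurrent weight matrix
    (hidden_size x hidden_size) and one bias vector (hidden_size).
    """
    per_block = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return num_blocks * per_block


def lstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input gate, forget gate, output gate and candidate cell state (4 blocks)
    return recurrent_layer_params(input_size, hidden_size, 4)


def gru_params(input_size: int, hidden_size: int) -> int:
    # GRU: update gate, reset gate and candidate hidden state (3 blocks)
    return recurrent_layer_params(input_size, hidden_size, 3)


if __name__ == "__main__":
    # Hypothetical sizes for illustration only (not taken from the thesis)
    input_size, hidden_size = 257, 512
    n_lstm = lstm_params(input_size, hidden_size)
    n_gru = gru_params(input_size, hidden_size)
    print(f"LSTM parameters: {n_lstm}")
    print(f"GRU parameters:  {n_gru}")
    print(f"GRU / LSTM ratio: {n_gru / n_lstm:.2f}")
```

Under these assumptions, a GRU layer always has three quarters of the parameters of an LSTM layer of the same size, which is consistent with the faster training reported in the abstract.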
Keywords/Search Tags: supervised speech enhancement, gated recurrent unit neural network, convolutional neural network, generalization ability, perceptual quality, objective intelligibility