
Research On Speech Enhancement Model Based On Conditional Deep Convolutional Generative Adversarial Networks

Posted on: 2020-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: W Chu
Full Text: PDF
GTID: 2428330590452611
Subject: Control Engineering
Abstract/Summary:
Voice interaction technology is increasingly used in everyday applications such as automatic speech recognition systems, smart homes, and mobile voice communication. Because of interference, the performance of voice interaction in real environments is far from satisfactory, and speech enhancement is an effective way to improve it. After reviewing the background and research status of speech enhancement, this thesis proposes a speech enhancement model based on conditional deep convolutional generative adversarial networks (C-DCGAN) that targets characteristics of speech and noise. C-DCGAN extends the basic GAN with convolutional layers and conditional information. The generator uses convolutional layers to extract speech features automatically and generates speech samples from the extracted features. The generated samples and clean speech samples are then fed into the discriminator, which judges whether each sample is real or generated. The network is trained by backpropagation so that the generator produces speech samples as close as possible to clean speech, while the discriminator learns to distinguish real samples from generated ones. Through this adversarial game, the generator captures the implicit features of the speech signal, so the model can output speech close to clean speech. Once training is complete, noisy (mixed) speech is fed into the model to generate enhanced speech. To evaluate the enhancement performance of the C-DCGAN model, this thesis builds a speech enhancement platform on the TensorFlow framework. Experimental comparison and analysis on a variety of noise datasets show that the C-DCGAN model enhances noisy speech and has better generalization ability. Compared with spectral subtraction and the DNN method, it improves both PESQ and STOI. The method offers ideas for future research in this field and has theoretical and practical value.
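The adversarial training described in the abstract can be illustrated with the standard conditional GAN losses. The sketch below is only an assumption about the general form: the abstract does not state the exact loss used in the thesis, and the function names (`bce`, `discriminator_loss`, `generator_loss`) and the example scores are hypothetical. Here `d_real` stands for the discriminator's score on a (clean speech, noisy speech) pair and `d_fake` for its score on a (generated speech, noisy speech) pair, with the noisy speech acting as the conditional information.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """D is trained to score real pairs as 1 and generated pairs as 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    """G is trained so that D scores its generated pair as real (1)."""
    return bce(d_fake, 1.0)


if __name__ == "__main__":
    # Hypothetical discriminator outputs for one training pair.
    d_real, d_fake = 0.9, 0.2
    print("discriminator loss:", discriminator_loss(d_real, d_fake))
    print("generator loss:", generator_loss(d_fake))
```

In a full model these scalar scores would come from convolutional networks, and the two losses would be minimized alternately by backpropagation. The generator loss falls as the discriminator is fooled more often, which is the "adversarial game" the abstract refers to.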
Keywords/Search Tags: Speech Enhancement, Conditional Deep Convolutional Generative Adversarial Networks, Deep Neural Networks, Conditional Information