Research On Speech Enhancement Model Based On Conditional Deep Convolutional Generative Adversarial Networks

Posted on:2020-07-27

Degree:Master

Type:Thesis

Country:China

Candidate:W Chu

GTID:2428330590452611

Subject:Control Engineering

Abstract/Summary:

Voice interaction technology is increasingly used in everyday life, for example in automatic speech recognition systems, smart homes, and mobile voice communication. Because of interference, the performance of voice interaction in real environments is still far from satisfactory, and speech enhancement is an effective way to improve it. After describing the background and research status of speech enhancement, this thesis proposes a speech enhancement model based on conditional deep convolutional generative adversarial networks (C-DCGAN), designed around some characteristics of speech and noise.

C-DCGAN adds convolutional layers and conditional information to the basic GAN. The generator uses convolutional layers to extract speech features automatically and generates speech samples from the extracted features. The generated speech samples and clean speech samples are then fed into the discriminator, which judges whether each sample is real or generated. The network is trained by backpropagation so that the generator produces speech samples as close as possible to clean speech while the discriminator learns to distinguish real samples from generated ones. Through this adversarial game, the generator learns the implicit features of the speech signal, so the model can output a speech signal close to clean speech. Once training is complete, noisy (mixed) speech is fed into the model to generate enhanced speech.

To evaluate the speech enhancement performance of the C-DCGAN model, this thesis builds a speech enhancement platform on the TensorFlow framework. Experimental comparison and analysis on a variety of noise datasets show that the C-DCGAN model can enhance noisy speech and has better generalization ability. Compared with spectral subtraction and a DNN method, both PESQ and STOI are improved. The method offers ideas for future research in this field and has some theoretical and practical value.
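The adversarial training described in the abstract can be illustrated with a minimal sketch of the two loss terms. The abstract does not give the loss functions, so this assumes the standard conditional GAN formulation with binary cross-entropy and a non-saturating generator loss, and it assumes the conditional information is the noisy input paired with each sample. The function names `bce`, `discriminator_loss`, and `generator_loss` are hypothetical, and the inputs are plain lists of discriminator output probabilities rather than real network outputs.

```python
import math

EPS = 1e-12  # guards against log(0)


def bce(prob, target):
    """Binary cross-entropy for a single probability in [0, 1]."""
    prob = min(max(prob, EPS), 1.0 - EPS)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Loss for the discriminator (assumed standard cGAN form).

    d_real: discriminator outputs for (clean speech, condition) pairs.
    d_fake: discriminator outputs for (generated speech, condition) pairs.
    The discriminator is rewarded for scoring real pairs near 1
    and generated pairs near 0.
    """
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss (an assumption, not from the thesis).

    The generator is rewarded when the discriminator scores its
    output near 1, i.e. mistakes it for clean speech.
    """
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # An undecided discriminator outputs 0.5 for every sample.
    print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))
    print(generator_loss([0.5, 0.5]))
```

In a full model these losses would be minimized alternately by backpropagation through the discriminator and the generator; speech enhancement GANs often add a reconstruction term to the generator loss as well, but the abstract does not say whether this thesis does.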

Keywords/Search Tags:

Speech Enhancement, Conditional Deep Convolutional Generative Adversarial Networks, Deep Neural Networks, Conditional Information

Related items

1	Study And Application On The Structure Improved Deep Convolutional Generative Adversarial Networks
2	Research On Deep Neural Networks For Multi-focus Image Fusion
3	Single Channel Speech Enhancement Based On Generative Adversarial Networks
4	Research On The Technology And Application Of Image Blind De-blurring Based On Conditional Generative Adversarial Networks
5	Research On Auto-encoders And Generative Adversarial Network Based Speech Enhancement
6	Research On Speech Enhancement Based On Wasserstein Generative Adversarial Networks
7	Research On Generative Adversarial Networks-Based Terrain Mapping
8	Research On Facial Expression Analysis Based On Conditional Generative Adversarial Nets
9	Research Of Speech Enhancement Based On Deep Convolution Generation Adversarial Networks
10	Research And Implementation Of Speech Enhancement Algorithm Based On Deep Learning