Research On Speech Dereverberation Based On Improved Wasserstein Generative Adversarial Networks

Posted on: 2022-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: L F Rao
Full Text: PDF
GTID: 2518306569972899
Subject: Signal and Information Processing
Abstract/Summary:
Speech dereverberation refers to reducing or eliminating, from an acoustic signal, the reverberation formed by the sum of all reflected sound waves. Speech dereverberation and speech noise reduction both belong to the category of speech enhancement, and almost all intelligent speech systems use speech enhancement as a front-end processing technology. At present, the most widely studied speech enhancement technology is speech noise reduction. However, reverberation exists in almost any enclosed space, and strong reverberation has a serious negative impact on the signal received by the microphone. With the great increase in computing power, methods based on deep neural networks have been widely used in image and speech signal processing. However, most of these methods only consider eliminating noise in the speech signal and ignore the reverberation phenomenon, which also affects speech quality; as a result, the accuracy of speech recognition and subsequent language processing systems is degraded. This thesis studies speech dereverberation algorithms based on an improved Wasserstein Generative Adversarial Network. The main work is as follows:

First, we thoroughly review the development and the main research results in the field of speech dereverberation, and analyze in detail the theoretical background of speech dereverberation. Then, based on the difficulties of speech dereverberation and on existing deep-learning-based dereverberation methods, we confirm the effectiveness of using generative adversarial networks to remove speech reverberation.

Second, we propose to apply the Wasserstein Generative Adversarial Network to speech dereverberation. The generator uses an encoder-decoder convolutional neural network structure. In order to retain more speech information, fewer pooling layers are used in the network structure. The discriminator uses a deep neural network structure, which can better fit the Wasserstein distance function. Experimental results show that the dereverberation model can effectively remove speech reverberation, especially for severely reverberated speech. Compared with traditional dereverberation algorithms and deep neural network methods, its dereverberation performance is better.

Third, we propose to apply an improved Wasserstein Generative Adversarial Network to speech dereverberation. When the original Wasserstein Generative Adversarial Network processes speech with early reverberation, the dereverberation effect is not ideal. The improved model adds a gradient penalty term to the loss function of the discriminator, which addresses the unreasonable value distribution of the discriminator network parameters, improves the stability of the network, and makes the model converge faster during training. Experimental results show that the proposed algorithm generalizes better in dereverberation, and that it can better remove both early and late reverberation while preserving speech quality.
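The abstract does not give the exact form of the gradient penalty term. As a rough illustration only, the sketch below assumes the commonly used WGAN-GP formulation, in which the discriminator loss is E[D(fake)] - E[D(real)] + lam * E[(||grad D(x_hat)|| - 1)^2], with x_hat sampled on the line between a real and a generated sample. The toy one-hidden-layer discriminator, the function names, and the weight lam = 10 are hypothetical choices for this example and are not taken from the thesis, whose discriminator is a deep neural network.

```python
import math
import random


def _hidden(x, W):
    """Hidden activations tanh(W x) of the toy discriminator (hypothetical)."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]


def critic(x, W, v):
    """Toy discriminator D(x) = v . tanh(W x); returns a scalar score."""
    return sum(vi * hi for vi, hi in zip(v, _hidden(x, W)))


def critic_input_grad(x, W, v):
    """Analytic gradient of D with respect to the input x."""
    # d/dx [v_i * tanh(W_i . x)] = v_i * (1 - tanh^2(W_i . x)) * W_i
    scale = [vi * (1.0 - hi * hi) for vi, hi in zip(v, _hidden(x, W))]
    return [sum(scale[i] * W[i][j] for i in range(len(W)))
            for j in range(len(x))]


def wgan_gp_critic_loss(real_batch, fake_batch, W, v, lam=10.0, rng=random):
    """Discriminator loss with gradient penalty (assumed WGAN-GP form).

    real_batch and fake_batch are equal-length lists of feature vectors,
    e.g. clean and dereverberated speech features. lam is an assumed weight.
    """
    n = len(real_batch)
    # Wasserstein part: mean score of generated minus mean score of real.
    wasserstein = (sum(critic(f, W, v) for f in fake_batch)
                   - sum(critic(r, W, v) for r in real_batch)) / n
    penalty = 0.0
    for real, fake in zip(real_batch, fake_batch):
        eps = rng.random()  # random point on the line between real and fake
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]
        grad = critic_input_grad(x_hat, W, v)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        # Penalize deviation of the input-gradient norm from 1.
        penalty += (grad_norm - 1.0) ** 2
    return wasserstein + lam * penalty / n
```

In a real training loop the input gradient would come from an automatic differentiation framework rather than a hand-written formula. The analytic version is used here only so that the sketch has no dependencies.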
Keywords/Search Tags:speech dereverberation, Generative Adversarial Network, deep learning, Wasserstein distance