Research on Speech Enhancement Based on Deep Convolutional Generative Adversarial Networks

Posted on: 2021-03-08  Degree: Master  Type: Thesis
Country: China  Candidate: X N Zhang  Full Text: PDF
GTID: 2428330626463686  Subject: Software engineering
Abstract/Summary:
Speech is an important means of communication between people, but in practice it is usually corrupted by noise, so extracting near-clean speech from noisy environments is a research focus. Speech enhancement is an important area of speech signal processing that addresses noise reduction and has been applied in communications and other fields. Traditional speech enhancement methods based on statistical models must assume that speech and noise follow particular distributions and must estimate the energy of the noise. However, noise is diverse and non-stationary, so these methods denoise incompletely and leave residual noise or musical noise, and their results are not ideal.

With the rise of deep learning, a variety of neural network models have proven able to learn nonlinear relationships in data and have been applied in many fields. Among them, the Generative Adversarial Network (GAN) has become one of the most popular models since it was proposed and has achieved good results in image processing, but its use in speech enhancement is still at an early stage. Speech enhancement methods based on conventional neural networks often ignore the phase problem. GAN-based methods do reduce noise, but the enhanced speech often loses part of its information, high-frequency content is poorly recovered, and both speech intelligibility and noise reduction ability need improvement, so the denoising result is still not ideal. To solve these problems and improve noise reduction, this thesis proposes a network model for speech enhancement based on Deep Convolutional Generative Adversarial Networks (DCGAN).

The main work of this thesis is as follows:

(1) To address the poor noise reduction of traditional methods, and to strengthen the speech recovery and noise reduction of GAN-based enhancement, this thesis models speech denoising with a deep convolutional GAN, which extracts effective features through multi-layer convolution and improves speech recovery. The GAN generator is improved with a U-Net structure, and skip connections are added to avoid the information loss caused by an overly deep network, to retain more detail by passing more features forward, and to improve noise reduction.

(2) To address the problem that the DCGAN-based method leaves some noise at low SNR, which results in a low segmental SNR, this thesis proposes a speech enhancement method based on WDCGAN. It uses the Wasserstein distance instead of cross-entropy as the loss function to strengthen the network's learning ability and further reduce noise, thereby improving the segmental SNR of the enhanced speech.
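The abstract's last point, replacing the cross-entropy loss with a Wasserstein distance and judging the result by segmental (segmented) SNR, can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the 256-sample frame length, and the [-10, 35] dB clipping range are assumptions chosen for illustration, and the discriminator outputs are passed in as plain lists rather than produced by a network.

```python
import math


def cross_entropy_discriminator_loss(real_probs, fake_probs):
    """Standard GAN discriminator loss; inputs are probabilities in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    real_term = -sum(math.log(p + eps) for p in real_probs) / len(real_probs)
    fake_term = -sum(math.log(1.0 - p + eps) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term


def wasserstein_critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss; inputs are unbounded scores, not probabilities.

    Minimizing this widens the gap between mean real and mean fake scores.
    A real WGAN also needs a Lipschitz constraint (weight clipping or a
    gradient penalty), which is omitted here.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real


def segmental_snr(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, each frame clipped to [lo, hi].

    frame_len, lo and hi are assumed values, not taken from the thesis.
    Trailing samples that do not fill a whole frame are ignored.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        error_energy = sum((x - y) ** 2 for x, y in zip(c, e))
        if error_energy == 0.0:
            snr = hi  # perfect reconstruction of this frame
        elif signal_energy == 0.0:
            snr = lo  # silent reference frame
        else:
            snr = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(min(max(snr, lo), hi))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

In this sketch the cross-entropy loss saturates as the discriminator's probabilities approach 0 or 1, while the Wasserstein critic loss stays linear in the scores, which is the usual motivation for the swap. Segmental SNR averages frame-level SNRs, so residual noise in quiet frames lowers it more than it lowers a single global SNR.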
Keywords/Search Tags: Speech Enhancement, Deep Convolutional Generative Adversarial Networks, Wasserstein distance