
Research On Speech Denoising Method Based On Deep Learning

Posted on: 2022-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: R Li
Full Text: PDF
GTID: 2518306344452134
Subject: Automation Technology
Abstract/Summary:
Speech is not only the main carrier of information and emotional expression between people, but also the main channel for issuing commands to smart devices. In the process of transmission, suppressing background noise to improve speech quality and intelligibility is the central goal of speech denoising. Traditional methods must assume a distribution for the speech and estimate the energy of the noise signal; they are effective against stationary noise but perform poorly on non-stationary noise. With the development of artificial intelligence, deep learning has been applied in many areas with good results. Deep-learning-based denoising can effectively suppress non-stationary noise and has broad prospects in practical applications. This thesis therefore studies speech denoising methods based on deep learning.

To improve the quality and intelligibility of denoised speech, this thesis proposes two speech denoising models: Attention Res-UNetGAN and Res-SA Wave-U-Net. Both models denoise directly in the waveform domain, require no Fourier transform or similar operations, are simple to apply, can make full use of the phase information of the speech signal, and are end-to-end. Experiments show that both proposed models improve the quality and intelligibility of the denoised speech. The main work of this thesis is as follows:

(1) This thesis proposes the Attention Res-UNetGAN speech denoising model. A generative adversarial network built only from cascaded convolutions has insufficient capacity to represent the low-level information in the speech signal, and its direct skip connections cause feature redundancy. Attention Res-UNetGAN addresses these problems with a residual-residual block and attention skips. The residual-residual block joins two basic residual blocks with a further residual connection, improving the network's ability to model speech by increasing the number of nonlinear paths. The symmetric up-sampling and down-sampling layers are connected through a combination of attention skips and direct skips for feature concatenation, which reduces the model's computational burden, retains more feature information, and strengthens gradient propagation. Evaluated on the VCTK dataset with PESQ and other objective speech quality metrics, the Attention Res-UNetGAN model improves the quality and intelligibility of the denoised speech.

(2) This thesis proposes the Res-SA Wave-U-Net speech denoising model. The Wave-U-Net denoising model has two shortcomings: its up- and down-sampling blocks use only convolution operations, which cannot capture long-range dependencies in the speech signal, and there is a feature mismatch when the up- and down-sampling blocks are joined for feature concatenation. The Res-SA block and the ResPath are proposed and integrated into the Wave-U-Net network to form a new denoising model, Res-SA Wave-U-Net. The Res-SA block is composed of a residual block and a self-attention block fused with 1-D non-causal dilated convolution: the dilated convolution enlarges the network's receptive field, and the temporal speech features learned by the self-attention block enhance the model's robustness. The ResPath effectively reduces the semantic gap in feature concatenation. Finally, the effectiveness of the Res-SA block and the ResPath and the denoising ability of the Res-SA Wave-U-Net model are verified on VCTK: the PESQ, CSIG, CBAK, and COVL scores of Res-SA Wave-U-Net are 2.55, 3.85, 3.29, and 3.19 respectively, which are 6.25%, 9.38%, 1.54%, and 7.77% higher than Wave-U-Net.
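The role of 1-D non-causal dilated convolution in the Res-SA block can be illustrated with a minimal sketch. The thesis does not give an implementation; the function names below are illustrative, and the sketch assumes a single-channel "same"-padded convolution plus the standard receptive-field formula for stacked dilated layers.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """'Same'-padded 1-D non-causal dilated convolution, single channel.

    Each output sample looks both backward and forward in time,
    with taps spaced `dilation` samples apart (non-causal)."""
    k = len(w)
    pad = (k - 1) * dilation // 2
    xp = np.pad(x, pad)  # zero-pad both ends so output length == input length
    return np.array([
        sum(w[j] * xp[i + j * dilation] for j in range(k))
        for i in range(len(x))
    ])

def receptive_field(kernel_sizes, dilations):
    """Receptive field of stacked dilated conv layers: 1 + sum (k-1)*d."""
    return 1 + sum((k - 1) * d for k, d in zip(kernel_sizes, dilations))

# Three stacked layers with kernel size 3 and dilations 1, 2, 4 already
# cover 15 consecutive samples, versus 7 for undilated convolutions:
print(receptive_field([3, 3, 3], [1, 2, 4]))  # 15
print(receptive_field([3, 3, 3], [1, 1, 1]))  # 7
```

This is why dilating the convolutions lets the block see longer stretches of the waveform without adding parameters or layers, which the self-attention part of the Res-SA block then complements with genuinely global temporal dependencies.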
Keywords/Search Tags:Deep learning, Speech denoising, Residual structure, Attention mechanism