
Research On Speech Enhancement Algorithm Based On Dilated Convolutional Network

Posted on: 2022-01-22  Degree: Master  Type: Thesis
Country: China  Candidate: Y D Xu  Full Text: PDF
GTID: 2518306722967179  Subject: Computer technology
Abstract/Summary:
As real-time voice communication has spread, users have come to expect ever higher voice quality. In real environments, however, speech signals are corrupted by many kinds of noise, which degrades call quality and significantly increases the error rate of speech recognition systems. Traditional speech enhancement starts from the signal itself: it makes assumptions about the signal's time-frequency characteristics and builds models on them. Many of the resulting parameters must be tuned by hand, which is time-consuming and labor-intensive. Artificial intelligence techniques, with their strong fitting ability and data-driven training, open up more possibilities for speech enhancement.

This thesis studies speech enhancement with artificial neural networks. In this approach, a network is trained on clean speech and noisy speech so that it learns to suppress noise. The thesis introduces the relevant theory of artificial neural networks and convolutional neural networks, examines the structure of dilated convolutional networks and residual networks in depth, and, drawing on the characteristics of dilated convolutional networks, improves two existing models: the WaveNet-based speech enhancement network and the speech enhancement network based on deep feature losses. Both the denoising effect and the convergence speed of the models improve to some extent. The specific work is as follows:

(1) To improve the denoising effect of the WaveNet-based speech enhancement network, the thesis proposes combining the raw speech signal with higher-level features derived from it, introducing Mel-frequency cepstral coefficients (MFCC) into the model. Experimental results show that the method is effective: although the improved model denoises more slowly, the signal-to-noise ratio (SNR) of the test-set speech and the convergence speed during training both improve.

(2) To reduce the training time of the speech enhancement network based on deep feature losses, the thesis proposes adding residual connections, following the structure of the WaveNet-based speech enhancement network. Experimental results show that the method is feasible: the improved model both shortens the training time and improves the SNR of the test-set speech.
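The two building blocks named in the abstract, dilated convolution and residual connections, and the SNR metric used to report results can be illustrated with a short NumPy sketch. This is not the thesis implementation: the function names, the `tanh` nonlinearity, the kernel size, and the SNR definition (clean-signal energy over the energy of the remaining error) are all assumptions made for illustration.

```python
import numpy as np


def snr_db(clean, estimate):
    """SNR in dB, treating (estimate - clean) as the residual noise.

    Assumed definition; the abstract does not state which variant is used.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(estimate, dtype=float) - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))


def dilated_causal_conv(x, kernel, dilation):
    """1-D causal convolution: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before the start of the signal are treated as zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        shift = k * dilation
        if shift < len(x):
            y[shift:] += w * x[:len(x) - shift]
    return y


def residual_block(x, kernel, dilation):
    """Dilated convolution followed by tanh, added back onto the input.

    The identity (skip) path is what a residual connection contributes.
    """
    x = np.asarray(x, dtype=float)
    return x + np.tanh(dilated_causal_conv(x, kernel, dilation))


def receptive_field(kernel_size, dilations):
    """Number of input samples seen by one output of a stacked dilated network."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With kernel size 2 and dilations 1, 2, 4, 8, four layers already cover 16 input samples, which is why doubling the dilation per layer lets a network see long stretches of audio without a correspondingly deep stack.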
Keywords/Search Tags:Speech Enhancement, Artificial Neural Network, Dilated Convolutional Network, Residual Network