Research On Speech Enhancement Based On Deep Neural Network

Posted on:2020-12-02

Degree:Master

Type:Thesis

Country:China

Candidate:N Li

GTID:2428330620956144

Subject:Information and Communication Engineering

Abstract/Summary:

Speech enhancement is widely used in speech signal processing and artificial intelligence systems. In practical environments, traditional speech enhancement algorithms suffer from limited enhancement performance and poor generalization. Drawing on the perceptual characteristics of the human auditory system and on deep learning network structures, this thesis studies single-channel speech enhancement based on deep learning. Two algorithms are proposed: a speech enhancement algorithm based on the multi-resolution cochleagram (MRCG) feature and a deep neural network (DNN), and a speech enhancement algorithm based on spectrograms and conditional Generative Adversarial Nets (cGAN).

(1) Speech enhancement based on the MRCG feature and a DNN. Instead of the traditional Short-Time Fourier Transform (STFT), this algorithm uses a Gammatone filterbank and extracts the MRCG of each time-frequency unit as the spectral feature. Each frame is combined with the two frames before and after it to form the DNN input. The training target is the Ideal Ratio Mask (IRM). The DNN updates its weights with the Root Mean Square Propagation (RMSProp) algorithm, which addresses the training instability of traditional networks. Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) are used as evaluation metrics. Simulation results show that this algorithm outperforms the traditional algorithm.

(2) Speech enhancement based on spectrograms and cGAN. At present, cGAN is mostly used for image enhancement and recognition. This thesis proposes a cGAN-based mapping from the noisy spectrogram to the enhanced spectrogram. The original noisy input is fed to the generator network as the condition, and the generator is trained with a U-Net encoder-decoder structure that adds skip connections between the downsampling and upsampling layers. STOI and PESQ are again used as evaluation metrics. Simulation results show that cGAN improves speech separation quality and achieves a better STOI than the MRCG-based algorithm. For babble noise, cGAN is also more effective than the MRCG-based algorithm. In addition, cGAN generalizes well across different kinds of noise.
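The three building blocks named in item (1) of the abstract (the IRM training target, context-frame stacking, and the RMSProp update) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the exponent `beta=0.5`, the edge-padding strategy, and the learning-rate and decay defaults are all hypothetical choices, and the inputs are generic time-frequency energy arrays rather than actual Gammatone or MRCG features.

```python
import numpy as np


def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5, eps=1e-12):
    """IRM per time-frequency unit: (S / (S + N)) ** beta.

    Inputs are non-negative energies of the clean speech and the noise.
    beta=0.5 is a commonly used value and an assumption here.
    """
    speech_energy = np.asarray(speech_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)
    return (speech_energy / (speech_energy + noise_energy + eps)) ** beta


def splice_frames(features, context=2):
    """Stack each frame with `context` frames on either side.

    features: (num_frames, num_dims)
    returns:  (num_frames, (2 * context + 1) * num_dims)
    Edge frames are padded by repetition (an assumption; the abstract
    does not say how utterance boundaries are handled).
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    # Window i holds the frame at offset (i - context) from the current one.
    windows = [padded[i:i + num_frames] for i in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


def rmsprop_step(param, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """One RMSProp update; returns the new parameter and the new cache."""
    cache = decay * cache + (1.0 - decay) * grad ** 2
    param = param - lr * grad / (np.sqrt(cache) + eps)
    return param, cache
```

With `context=2`, each DNN input vector is five times as wide as a single-frame feature vector, which is the frame combination the abstract describes. The RMSProp step divides each gradient by a running root-mean-square of its recent magnitudes, so parameters with large or erratic gradients take proportionally smaller steps.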

Keywords/Search Tags:

Deep Neural Network, Speech Enhancement, Conditional Generative Adversarial Nets

Related items

1	Research On Speech Enhancement Based On Deep Neural Network
2	Research On Single-Channel Speech Enhancement Based On Generative Adversarial Network
3	Research On Auto-encoders And Generative Adversarial Network Based Speech Enhancement
4	Research On End-to-end Multi-speech Separation Technology Based On Generative Adversarial Nets
5	Single Channel Speech Enhancement Based On Generative Adversarial Networks
6	Speech Enhancement Based On Iterative Mask Estimation And Generative Adversarial Networks
7	Research On Deep Learning Based Speech Enhancement
8	Research On Ultrasound Image Detail Enhancement With Generative Adversarial Nets
9	Signal Reconstruction Based On Generative Adversarial Networks
10	Research And Implementation Of Speech Enhancement Algorithm Based On Deep Learning