Speech Enhancement Based On Multi-band Excitation Model

Posted on: 2019-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q Z Huang
Full Text: PDF
GTID: 2428330593450595
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, speech enhancement based on prior knowledge has become a research hotspot because it performs well under nonstationary noise. The codebook-driven method is a representative example. It treats the autoregressive (AR) coefficients and excitation variances of speech and noise as random variables, estimates the probability of every code vector under a minimum mean squared error (MMSE) criterion, and uses the resulting weighted combination of code vectors as the parameter estimate. A Wiener filter built from these parameters then enhances the speech. However, this method still has problems: it cannot remove the noise between harmonics, and it raises the issue of noise classification. This thesis proposes solutions to these problems. Its main contributions fall into two parts.

(1) Multi-band excitation (MBE) speech coding is studied in depth and combined with the codebook-driven method to form a codebook-based MBE speech enhancement method. The noisy speech is first pre-enhanced by the Bayesian codebook-based method. Three MBE model parameters are then extracted from each frame of the pre-enhanced spectrum: the pitch period, the harmonic magnitudes, and the voiced/unvoiced (V/UV) decision for each band. According to the V/UV decisions, different strategies are used to synthesize the voiced and unvoiced components. To make the MBE model parameters more accurate, the thesis also introduces the speech presence probability into the codebook-based method to modify the Wiener filter. Experiments show that the proposed method improves speech quality and removes noise both between harmonics and in silent segments.

(2) To address inaccurate parameter estimation in the MBE model, the thesis proposes an MBE speech enhancement method based on deep neural networks (DNNs). Two DNN models are trained offline to estimate two MBE model parameters: the harmonic magnitudes and the band error function. The harmonic magnitudes are processed by linear interpolation to form the training target of one network. For the error function, the full frequency band is divided into eight parts, and the error function extracted for each part serves as the training target. For the pitch period, the noisy speech is pre-processed by multi-band spectral subtraction, and the pitch period is then extracted by the MBE analysis method. Experimental results show that the proposed method improves both the quality and the intelligibility of speech.

The performance of the proposed methods is evaluated using spectrograms, Perceptual Evaluation of Speech Quality (PESQ), Segmental Signal-to-Noise Ratio (SSNR), Log-Spectral Distortion (LSD), and Short-Time Objective Intelligibility (STOI). Experimental results show that the proposed methods outperform the reference methods.
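The final step of the codebook-driven method, building a Wiener filter from AR parameters, can be illustrated with a minimal sketch. It assumes that the AR coefficients and excitation variances of speech and noise have already been estimated; the codebook search and MMSE weighting are not shown. The function names, the Hann window, and the FFT size are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def ar_power_spectrum(ar_coeffs, excitation_var, n_fft=512):
    """AR spectral envelope sigma^2 / |A(w)|^2 on n_fft // 2 + 1 bins.

    ar_coeffs holds a_1..a_p of A(z) = 1 + sum_k a_k z^-k (assumed convention).
    """
    a = np.concatenate(([1.0], np.asarray(ar_coeffs, dtype=float)))
    A = np.fft.rfft(a, n_fft)  # zero-padded DFT of the AR polynomial
    return excitation_var / np.abs(A) ** 2

def wiener_gain(speech_psd, noise_psd):
    """Per-bin Wiener gain H = Px / (Px + Pw)."""
    return speech_psd / (speech_psd + noise_psd)

def enhance_frame(noisy_frame, speech_ar, speech_var, noise_ar, noise_var, n_fft=512):
    """Apply the Wiener gain to one windowed frame (hypothetical helper)."""
    window = np.hanning(len(noisy_frame))
    spectrum = np.fft.rfft(noisy_frame * window, n_fft)
    gain = wiener_gain(
        ar_power_spectrum(speech_ar, speech_var, n_fft),
        ar_power_spectrum(noise_ar, noise_var, n_fft),
    )
    # Back to the time domain, trimmed to the original frame length
    return np.fft.irfft(gain * spectrum, n_fft)[: len(noisy_frame)]
```

Because the gain follows only the smooth AR envelopes, it cannot distinguish harmonic peaks from the valleys between them, which is consistent with the limitation the abstract attributes to the codebook-driven method.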
Keywords/Search Tags: Speech enhancement, Multi-band excitation model, Codebook-driven method, Deep neural network, Parameter estimation