
Research On Speech Enhancement Algorithms Of Microphone Array Based On Time-Frequency Masking

Posted on: 2021-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: P J Cui
GTID: 2518306050454414
Subject: Communication and Information System
Abstract/Summary:
As a carrier of information, speech plays a crucial role in many everyday scenarios, such as intelligent driving, smart homes and video conferencing. In real environments, however, the received speech is accompanied by various noises that greatly degrade its quality and intelligibility. A speech enhancement algorithm is therefore needed to remove the noise from the noisy speech and improve both. The research on microphone array speech enhancement in this thesis is divided into three parts: a speech distortion weighted multichannel Wiener filter (SDW-MWF) based on time-frequency masking, post-filtering based on masking, and an improved SDW-MWF algorithm based on time-frequency masking.

Conventional beamforming algorithms require prior information about the microphone array and the sound source, and their denoising performance degrades severely when this prior information is inaccurate. The SDW-MWF algorithm adopted in this thesis instead estimates the noise covariance matrix and the desired speech covariance matrix without such prior information, and then solves for the microphone array speech enhancement weights. Traditional noise covariance estimation methods, such as voice activity detection (VAD) and speech presence probability (SPP), may yield a biased noise covariance estimate, resulting in distortion of the output speech or a large amount of residual noise. This thesis uses time-frequency masking to compute the noise covariance matrix. Assuming that the observed signal follows a complex Gaussian mixture model (CGMM), the probability that each time-frequency unit is noise is computed with the Expectation-Maximization (EM) algorithm, and the noise covariance matrix is estimated from these probabilities. Applying this method to the SDW-MWF algorithm gives good denoising results. Experiments show that the proposed enhancement algorithm improves the PESQ (perceptual evaluation of speech quality) and STOI (short-time objective intelligibility) scores of the output speech.

Because the enhancement achieved by the SDW-MWF algorithm alone is limited, this thesis combines post-filtering with the SDW-MWF filter to further eliminate residual noise. The post-filtering algorithms used are the ideal binary mask (IBM) and the ideal ratio mask (IRM), which require time-domain noise as input. A multichannel noise estimation algorithm based on the multichannel Wiener filter (MWF) is therefore used: following the way the MWF solves for the desired signal, and taking the noise as the desired signal instead, an estimate of the noise is obtained. Finally, the output of the SDW-MWF based on time-frequency masking and the noise estimate are used as inputs to the post-filter to suppress the noise. Experiments show that IBM post-filtering removes more noise but distorts the output speech, whereas IRM post-filtering further suppresses the noise and also improves the speech quality.

In most cases, microphone array speech enhancement involves only one target sound source. This thesis therefore analyzes the performance and shortcomings of the SDW-MWF algorithm under the assumption of a single target source and proposes an improved algorithm. The improved algorithm is simpler than SDW-MWF and produces higher-quality output speech. The denoising performance of the improved SDW-MWF algorithm based on time-frequency masking is compared with that of the SDW-MWF algorithm, and the experiments show that the improved algorithm yields better output speech quality.

The microphone array speech enhancement algorithm, based on the six-microphone uniform linear array used in this thesis, achieves good speech enhancement in both near-field and far-field conditions and under various noises. It can be used in video conferencing systems or other scenarios where speech enhancement is required for noise suppression.
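
As an illustration of the processing chain summarized above, the following is a minimal NumPy sketch for a single frequency bin. It is not the thesis implementation. It assumes the noise mask has already been obtained (for example from the CGMM/EM step, which is not shown), and it uses common textbook forms of the mask-weighted covariance, the SDW-MWF weights, and the binary and ratio masks. The function names, the trade-off parameter `mu`, the reference-microphone index `ref`, and the threshold `lc_db` are hypothetical choices made for this sketch.

```python
import numpy as np

def masked_covariance(Y, mask, eps=1e-10):
    """Mask-weighted spatial covariance for one frequency bin.

    Y:    (M, T) complex STFT coefficients (M microphones, T frames).
    mask: (T,) weights in [0, 1], e.g. the probability that a frame is noise.
    Returns sum_t mask[t] * y_t y_t^H / sum_t mask[t], shape (M, M).
    """
    weighted = Y * mask                      # scale each frame by its mask value
    return weighted @ Y.conj().T / max(mask.sum(), eps)

def sdw_mwf_weights(R_s, R_n, mu=1.0, ref=0):
    """SDW-MWF weights w = (R_s + mu * R_n)^{-1} R_s e_ref.

    mu trades noise reduction against speech distortion (assumed parameter);
    ref selects the reference microphone (assumed to be channel 0).
    """
    return np.linalg.solve(R_s + mu * R_n, R_s[:, ref])

def apply_filter(w, Y):
    """Filter output w^H y_t for every frame, shape (T,)."""
    return w.conj() @ Y

def ratio_mask(speech_power, noise_power, eps=1e-10):
    """Ratio mask in one common form: sqrt(S / (S + N))."""
    return np.sqrt(speech_power / (speech_power + noise_power + eps))

def binary_mask(speech_power, noise_power, lc_db=0.0, eps=1e-10):
    """Binary mask: 1 where the local SNR in dB exceeds the threshold lc_db."""
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(float)
```

In such a sketch, `R_n` would come from `masked_covariance(Y, noise_mask)`. One simple option for the speech covariance, assumed here rather than taken from the thesis, is `R_s = masked_covariance(Y, np.ones(T)) - R_n`. The post-filter masks would then be computed from the power of the filter output and the power of the estimated noise, and applied to the filter output per time-frequency unit.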
Keywords/Search Tags: Speech Enhancement, Microphone Array, Time-frequency Masking, Speech Distortion Weighted Multichannel Wiener Filter (SDW-MWF), Post-filtering