
Speech Enhancement Based On Speech Phase Estimation And Sound Source Spatial Feature

Posted on: 2021-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: R Cheng
Full Text: PDF
GTID: 2518306470969169
Subject: Information and Communication Engineering
Abstract/Summary:
Noise and reverberation seriously degrade speech quality in human-machine interaction systems, so the collected speech signal must be enhanced. Conventional monaural speech enhancement methods usually enhance the magnitude spectrum of noisy speech but ignore the phase spectrum, which limits their performance in complex scenes with low signal-to-noise ratio (SNR). Conventional multi-channel speech enhancement methods make insufficient use of the spatial information of the sound source and the phase information of speech, so errors remain in the estimate of the target sound source and the suppression of interference and noise needs to be improved. To address these problems, this thesis proposes three speech enhancement methods based on speech phase estimation and sound source spatial features. On this basis, a multi-channel speech coding and enhancement method based on sound source spatial features is also proposed.

Firstly, to address the problem that most speech enhancement methods enhance only the magnitude spectrum of noisy speech and ignore the phase spectrum, a monaural speech enhancement method based on a deep neural network (DNN) and a phase correction function is proposed. In this method, the DNN and the phase correction function are used to enhance the magnitude spectrum and the phase spectrum of the noisy speech simultaneously, which improves speech enhancement performance at low SNR.

Secondly, to address the problem that neural networks cannot be applied directly to phase spectrum enhancement, a monaural speech enhancement method based on the DNN and phase unwrapping is proposed. A phase unwrapping method based on cellular automata makes it possible for the neural network to estimate the phase spectrum. Combined with a magnitude spectrum enhancement method, this improves speech enhancement performance at low SNR.

Thirdly, to improve the method's ability to perceive the direction of the target sound source and to suppress noise in complex scenes, a multi-channel speech enhancement method based on sound source spatial features and speech phase information is proposed. In this method, the interchannel phase differences and the phase-sensitive masks are combined to construct an improved beamformer, and monaural post-filtering based on the DNN and the phase correction function is used to improve speech enhancement performance in complex scenes.

Finally, to explore the application of sound source spatial features in multi-channel speech coding and enhancement, a multi-channel speech coding and enhancement method based on sound source spatial features is proposed. In this method, time delay estimation across a uniform linear microphone array and the Enhanced Voice Services (EVS) codec are used to realize multi-channel speech coding and decoding. Combined with the proposed multi-channel speech enhancement method based on sound source spatial features and speech phase information, the target sound source is extracted and enhanced at the decoding end.
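
The abstract names interchannel phase differences and phase-sensitive masks but does not give their definitions. As a minimal sketch, assuming the commonly used textbook forms of both quantities (not necessarily the exact formulation used in the thesis), they can be computed from complex STFT arrays as follows. The function names, the `eps` stabilizer, and the clipping of the mask to [0, 1] are illustrative assumptions.

```python
import numpy as np


def phase_sensitive_mask(clean_stft, noisy_stft, eps=1e-8):
    """Commonly used phase-sensitive mask (assumed form):
    |S| / |Y| * cos(theta_S - theta_Y), clipped to [0, 1].
    """
    ratio = np.abs(clean_stft) / (np.abs(noisy_stft) + eps)
    phase_diff = np.angle(clean_stft) - np.angle(noisy_stft)
    return np.clip(ratio * np.cos(phase_diff), 0.0, 1.0)


def interchannel_phase_difference(stft_ref, stft_other):
    """Phase of the second channel relative to the reference channel,
    wrapped to (-pi, pi], for every time-frequency bin.
    """
    return np.angle(stft_other * np.conj(stft_ref))
```

The mask is largest where the clean and noisy spectra agree in both magnitude and phase, and the interchannel phase difference varies with the direction of arrival, which is why the two are plausible inputs for steering a beamformer toward the target source.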
Keywords/Search Tags:Speech enhancement, Speech coding, Deep neural network, Speech Phase Estimation, Sound source spatial feature