Research On Multi-Channel Speech Enhancement And Post-Processing Technology For Teleconference System

Posted on: 2023-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: B Q Chen
GTID: 2568307031492184
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of wireless voice communication technology, people’s expectations for voice interaction quality are steadily rising. Especially since the global outbreak of COVID-19, much work and study has been carried out through remote conference systems. During a call, the speech signal is often corrupted by noise in the acoustic environment, which reduces speech quality and intelligibility. Microphone array processing is a mainstream speech enhancement technology. Compared with single-channel speech enhancement algorithms, it can exploit spatial information and therefore suppresses noise from undesired directions more effectively. To alleviate the call quality problem of remote conference systems in noisy acoustic environments, this thesis focuses on multi-channel speech enhancement and post-filtering algorithms for such systems.

Firstly, beamforming can acquire the speech signal from a desired direction while suppressing interference from other directions. To address the low noise suppression efficiency and the speech distortion of current beamforming algorithms, this thesis proposes an improved generalized sidelobe canceller that combines multi-directional mainlobe speech activity detection with the signal-to-interference ratio. The speech detection result and the signal-to-interference ratio jointly control the update of the adaptive filter coefficients in the adaptive blocking matrix and the adaptive noise canceller, which improves the robustness of the multi-channel speech enhancement algorithm in adverse acoustic environments. The experimental results show that the improved algorithm strengthens noise suppression while preserving the speech information, and thus reduces speech distortion.

Secondly, some residual interference noise remains in the speech enhanced by the microphone array. To address this common problem, this thesis improves a low-parameter, real-time noise reduction network based on a recurrent neural network. The method uses Mel sub-band energy features as the main input of the model and reduces the differences between energy features in different frequency bands through sub-band filter coefficient normalization and sub-band feature normalization. In addition, a combination of time-domain and frequency-domain loss functions is used to improve the learning efficiency of the network parameters. An improved optimal spectral amplitude estimator is incorporated into the network structure to alleviate the residual pseudo-stationary noise between speech harmonics that is caused by the sub-band division in the neural network. The experimental results show that the improved algorithm further improves speech quality and intelligibility.

Finally, the effectiveness of the proposed improved algorithms is verified by experimental simulations. The improved beamforming algorithm not only suppresses interference noise from undesired directions effectively, but also reduces the distortion of the speech from the desired direction. For the residual interference noise that remains after this processing, the improved post-filtering algorithm delivers better real-time noise suppression. The complete cascaded speech enhancement system for remote conference systems is also shown to be more robust in adverse acoustic environments.
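The idea of letting speech activity detection and the signal-to-interference ratio jointly control the adaptive filter updates can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not give the actual control rule, so the gating function `update_flags`, the threshold `sir_threshold_db`, and the plain normalized LMS (NLMS) step in `nlms_step` are all hypothetical names and assumptions chosen for illustration. The sketch assumes the common convention that the adaptive blocking matrix adapts while target speech dominates and the adaptive noise canceller adapts while it does not.

```python
def nlms_step(weights, reference, desired, step, adapt, eps=1e-8):
    """One NLMS iteration; coefficients change only when `adapt` is True.

    Returns the (possibly updated) weights and the a-priori error.
    """
    estimate = sum(w * x for w, x in zip(weights, reference))
    error = desired - estimate
    if adapt:
        # Normalize the step by the reference energy (eps avoids division by zero).
        norm = sum(x * x for x in reference) + eps
        weights = [w + step * error * x / norm
                   for w, x in zip(weights, reference)]
    return weights, error


def update_flags(target_active, sir_db, sir_threshold_db=0.0):
    """Hypothetical gating rule (an assumption, not taken from the thesis).

    Returns (adapt_abm, adapt_anc):
    - the blocking matrix adapts only when target speech is detected
      and the estimated SIR is at or above the threshold;
    - the noise canceller adapts only when no target speech is detected
      and the estimated SIR is below the threshold.
    """
    adapt_abm = target_active and sir_db >= sir_threshold_db
    adapt_anc = (not target_active) and sir_db < sir_threshold_db
    return adapt_abm, adapt_anc


# Usage: a one-tap noise canceller frozen or adapting according to the flags.
adapt_abm, adapt_anc = update_flags(target_active=False, sir_db=-5.0)
anc_weights = [0.0]
for _ in range(50):
    anc_weights, residual = nlms_step(
        anc_weights, reference=[1.0], desired=0.5, step=0.5, adapt=adapt_anc
    )
```

Freezing both filters when the two cues disagree (for example, speech detected but a low SIR estimate) is a conservative design choice in this sketch: it trades slower adaptation for a lower risk of cancelling the target speech.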
Keywords/Search Tags: speech enhancement, generalized sidelobe canceller, optimal spectral amplitude estimator, neural network