
The Study Of Speech Enhancement Technology For Farfield Speech Recognition System

Posted on: 2020-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: X F Shu
Full Text: PDF
GTID: 2428330590971534
Subject: Information and Communication Engineering
Abstract/Summary:
In a far-field speech recognition system, the speech signals received by the microphone are degraded by environmental noise, speech-like interference, and reverberation, all of which reduce recognition performance. Efficient speech enhancement algorithms, including multi-channel speech dereverberation, multi-channel beamforming, and single-channel speech enhancement, have therefore become an active research topic in speech signal processing in recent years. For reverberation, the most commonly used method is the Multi-Channel Linear Prediction (MCLP) approach. For speech-like interference, beamforming algorithms are mainly used to suppress sources that lie outside the specified direction. For environmental noise, conventional signal-processing-based single-channel speech enhancement algorithms, such as Wiener filtering, were usually used in the past; with the development of deep learning in speech processing, methods based on Deep Neural Networks (DNNs) are now applied to speech enhancement. Since the Generalized Sidelobe Canceller (GSC) is mainly applied to beamforming, it is not formulated in this work; the thesis instead studies the MCLP adaptive dereverberation algorithm and the single-channel speech enhancement algorithm. The thesis is organized as follows.

First, the Recursive Least Squares (RLS) algorithm is, in theory, numerically unstable. The prototype RLS-based MCLP adaptive dereverberation algorithm is therefore improved, and an MCLP adaptive dereverberation algorithm based on QR-decomposition Recursive Least Squares (QR-RLS) is proposed. It achieves the same dereverberation performance as the prototype algorithm with better numerical stability. The QR-RLS-based algorithm is then extended to the Variable Forgetting Factor QR-decomposition Recursive Least Squares (VFFQR-RLS) algorithm, which selects an appropriate forgetting factor according to the change in the coefficient vector, so that the algorithm strikes a better balance between convergence and minimum mean square error (MMSE). Simulation results show that the two improved MCLP adaptive dereverberation algorithms deliver good dereverberation performance and numerical stability under different reverberation levels.

Second, for DNN-based single-channel speech enhancement, a new method based on Progressive Deep Neural Networks (PDNNs) or Progressive Long Short-Term Memory Networks (PLSTMs) is proposed to address the performance degradation in low Signal-to-Noise Ratio (SNR) environments. The method decomposes the overall enhancement task into multiple subtasks, and the previously completed subtasks provide prior knowledge for the subsequent ones so that the latter can better learn their targets. For the learning targets, acoustic features based on SNR are proposed. Simulation results show that the proposed single-channel speech enhancement algorithm based on PDNNs and PLSTMs significantly outperforms the original one based on DNNs and Long Short-Term Memory Networks (LSTMs), including in its generalization ability in low-SNR environments, reducing speech distortion and enhancing noise suppression.

Finally, a speech enhancement framework for the far-field speech recognition system is proposed, consisting of a Wiener filter pre-processing module, a speech dereverberation module, a beamforming module, and a single-channel speech enhancement module. Simulation results show that the proposed framework effectively suppresses the interference present in the far-field speech recognition system and significantly improves both speech quality and speech intelligibility.
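
To make the role of the forgetting factor concrete, the following is a minimal sketch of a conventional exponentially weighted RLS filter with a fixed forgetting factor, applied to a toy FIR system-identification problem. It is not the thesis's QR-RLS or VFFQR-RLS MCLP algorithm: the function name `rls_identify`, the parameters `lam` and `delta`, and the toy signal are all assumptions chosen for illustration. A value of `lam` closer to 1 averages over more past samples (lower steady-state error, slower tracking), while a smaller value tracks changes faster at the cost of a noisier estimate.

```python
import numpy as np


def rls_identify(x, d, order, lam=0.99, delta=100.0):
    """Conventional exponentially weighted RLS (illustrative, real-valued).

    Estimates w such that d[n] ~ w @ [x[n], x[n-1], ..., x[n-order+1]].
    lam is the (fixed) forgetting factor; delta scales the initial
    inverse-correlation matrix. Both names are hypothetical.
    """
    w = np.zeros(order)
    P = delta * np.eye(order)      # estimate of the inverse correlation matrix
    u = np.zeros(order)            # regressor holding the most recent inputs
    for n in range(len(x)):
        u = np.roll(u, 1)          # shift older samples back by one
        u[0] = x[n]                # newest sample goes first
        Pu = P @ u
        k = Pu / (lam + u @ Pu)    # gain vector
        e = d[n] - w @ u           # a priori estimation error
        w = w + k * e              # coefficient update
        P = (P - np.outer(k, Pu)) / lam   # uses symmetry of P: u^T P = (P u)^T
    return w


# Toy usage: identify a short FIR system from noisy observations.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
w_true = np.array([0.5, -0.3, 0.1])
d = np.convolve(x, w_true)[: len(x)] + 0.01 * rng.standard_normal(len(x))
w_hat = rls_identify(x, d, order=3, lam=0.99)
```

A variable forgetting factor scheme, as described in the abstract, would adjust `lam` at each step instead of keeping it fixed; the update rule for doing so is not given in this summary and is not reproduced here.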
Keywords/Search Tags:speech dereverberation, single-channel speech enhancement, multi-channel linear prediction, progressive long short-term memory networks