
Dual Microphone Speech Enhancement Based On Deep Learning And Beamforming

Posted on: 2020-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: L H Cui
Full Text: PDF
GTID: 2518306518963129
Subject: Computer technology
Abstract/Summary:
Speech interaction is the most direct and natural form of communication in human society. As one of its key technologies, speech recognition transforms speech signals into corresponding text. After years of intensive research, Automatic Speech Recognition (ASR) has achieved major breakthroughs and entered practical use. However, some technical problems remain, the most important of which is noise reduction. In practical applications, the surrounding acoustic environment is uncertain, so speech is often corrupted by environmental noise; this degrades speech quality and ultimately lowers the recognition rate significantly. Using speech enhancement to suppress noise, remove reverberation, and improve recognition accuracy in complex environments is therefore of great significance for applying speech technology in real production and daily life.

In this thesis, we propose three speech enhancement algorithms that combine beamforming with deep learning, and finally fuse the acoustic models trained with the three enhancement algorithms using the lattice combination method. First, delay-and-sum beamforming is performed on the two-channel speech signals, and the in-phase signals are summed to achieve speech enhancement. Departing from the traditional deep neural network, we first propose a speech enhancement algorithm based on an attention-driven recurrent convolutional neural network, which uses a CNN to extract deep features and an attention mechanism to weight the contribution of different frames. Second, to compensate for the local information lost in the CNN, the skip-connection (concatenation) operation of the U-Net architecture is used to fuse low-level and deep features. Then, to address the mismatch between the training and test sets, a reverberation-aware self-attention speech enhancement algorithm is further proposed, in which WPE (weighted prediction error) is used to estimate the reverberation component. Finally, the lattice combination method merges the acoustic models trained with the three algorithms above into a new acoustic model.

The experiments in this thesis use the corpus provided by the REVERB 2014 Challenge to verify effectiveness. In the speech recognition task, the word error rate (WER) is relatively reduced by 27.38% on the development set and by 24.92% on the evaluation set.
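The delay-and-sum step described above can be sketched as follows. This is a minimal two-channel illustration, not the thesis's implementation: it assumes the inter-channel delay is an integer number of samples and estimates it from the cross-correlation peak.

```python
import numpy as np

def delay_and_sum(ch1, ch2):
    """Align two microphone channels and average them.

    The relative delay is estimated from the peak of the
    cross-correlation, then ch2 is shifted so both channels are
    in phase before summing (illustrative sketch, integer delays only).
    """
    # Estimate the integer-sample delay from the cross-correlation peak.
    corr = np.correlate(ch1, ch2, mode="full")
    delay = np.argmax(corr) - (len(ch2) - 1)
    # Shift ch2 by the estimated delay and zero the wrapped-around edge.
    aligned = np.roll(ch2, delay)
    if delay > 0:
        aligned[:delay] = 0.0
    elif delay < 0:
        aligned[delay:] = 0.0
    # In-phase speech adds constructively; incoherent noise averages down.
    return 0.5 * (ch1 + aligned)
```

Averaging the aligned channels is what gives the array gain: the target speech arrives in phase after the shift, while diffuse noise on the two microphones is uncorrelated.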
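The frame-level attention idea, weighting each frame's contribution to a pooled representation, can be illustrated with a toy NumPy sketch. The scoring vector here is random for demonstration only; in the thesis's model it would be a trained parameter of the recurrent convolutional network.

```python
import numpy as np

def frame_attention(features):
    """Pool T feature frames into one vector via softmax attention.

    features: (T, D) array of per-frame deep features (e.g. CNN output).
    The scoring vector 'w' is a random stand-in for a learned query.
    """
    _, dim = features.shape
    rng = np.random.default_rng(0)
    w = rng.standard_normal(dim)        # stand-in for a learned parameter
    scores = features @ w               # one scalar score per frame
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over the T frames
    # Frames with higher scores contribute more to the pooled output.
    return weights @ features           # (D,) context vector
```

Because the weights form a softmax, the output is a convex combination of the frames: informative frames dominate, while noisy frames are down-weighted rather than discarded.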
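The reported gains are relative, not absolute, WER reductions. The distinction is worth making explicit; the baseline numbers below are illustrative only and do not come from the thesis.

```python
def relative_wer_reduction(baseline_wer, enhanced_wer):
    """Relative WER reduction, as a percentage of the baseline."""
    return 100.0 * (baseline_wer - enhanced_wer) / baseline_wer

# Illustrative numbers only: a baseline WER of 20.0% falling to 14.52%
# is an absolute drop of 5.48 points but a 27.4% relative reduction.
print(round(relative_wer_reduction(20.0, 14.52), 2))  # → 27.4
```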
Keywords: Speech enhancement, Beamforming, CNN, Attention mechanism, Lattice combination