
Research on Far-Field Speech Recognition Methods Based on Beamforming and DNN

Posted on: 2019-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: X D Wang
Full Text: PDF
GTID: 2428330545458763
Subject: Communication and Information System
Abstract/Summary:
Speech recognition in near-field scenes has achieved satisfactory results, but far-field speech recognition remains challenging because of high levels of noise and reverberation. As an alternative to a single-channel microphone, microphone array beamforming has become an important part of far-field intelligent speech acquisition and speech recognition. Deep neural networks have shown great advantages in speech recognition because of their powerful modeling capabilities. Far-field speech recognition based on beamforming and deep neural networks has therefore become a hot research topic in recent years.

Building on microphone array algorithms and deep neural networks, this thesis describes the basic theory of far-field speech recognition and the basic speech recognition pipeline, and analyzes how beamforming can be used to achieve speech enhancement. The two major types of acoustic model used in speech recognition, the DNN-HMM acoustic model and the end-to-end acoustic model, are described in detail, along with the basic decoding algorithm.

On this basis, the thesis studies far-field speech recognition methods that incorporate speech enhancement. Traditional systems treat speech enhancement and speech recognition as two independent processes; to address this, the thesis presents two improvements. The first is a far-field speech recognition method based on an improved beamformer network, motivated by the observation that multichannel cross-correlation coefficient information is more robust in noisy and reverberant environments. In this method, the parameters of the minimum variance distortionless response (MVDR) beamformer are estimated using multichannel cross-correlation coefficients as the input features of the beamformer network. Compared with the original algorithm, this method reduces computational complexity and system training time while improving recognition performance. The second is a far-field speech recognition method based on an attention-mechanism acoustic model. This method combines the speech enhancement network and the speech recognition model into a single system and extends the existing attention-based framework to far-field scenes. The results show that this method can improve recognition performance.
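
For readers unfamiliar with MVDR beamforming, the following is a minimal sketch of the classical closed-form MVDR weights for one frequency bin, w = R^{-1} d / (d^H R^{-1} d). It is not the beamformer network described in the thesis, which estimates the beamformer parameters with a neural network. The function name `mvdr_weights`, the four-microphone geometry, the delays, and the synthetic noise are all assumptions made for illustration.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Classical MVDR weights for one frequency bin.

    noise_cov: (M, M) Hermitian noise covariance matrix R.
    steering:  (M,) steering vector d toward the target source.
    Returns w = R^{-1} d / (d^H R^{-1} d).
    """
    r_inv_d = np.linalg.solve(noise_cov, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)


# Hypothetical setup: 4 microphones, one 1 kHz bin, assumed arrival delays.
rng = np.random.default_rng(0)
num_mics = 4
freq_hz = 1000.0
delays_s = np.array([0.0, 1e-4, 2e-4, 3e-4])
steering = np.exp(-2j * np.pi * freq_hz * delays_s)

# Synthetic noise snapshots, used only to build a positive-definite covariance.
snapshots = (rng.standard_normal((num_mics, 200))
             + 1j * rng.standard_normal((num_mics, 200)))
noise_cov = snapshots @ snapshots.conj().T / 200 + 1e-3 * np.eye(num_mics)

w = mvdr_weights(noise_cov, steering)

# Distortionless constraint: the response toward the target, w^H d, is 1.
target_response = np.vdot(w, steering)

# Output noise power w^H R w, compared with a delay-and-sum beamformer.
w_das = steering / num_mics
noise_power_mvdr = np.vdot(w, noise_cov @ w).real
noise_power_das = np.vdot(w_das, noise_cov @ w_das).real
```

Both beamformers pass the target direction undistorted; MVDR additionally minimizes the output noise power under that constraint, so `noise_power_mvdr` is never larger than `noise_power_das` for the same covariance.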
Keywords/Search Tags: far-field speech recognition, beamforming, deep neural network, end-to-end, attention mechanism