Far-Field Speech Recognition Methods Research Based On Beamforming And DNN

Posted on:2019-01-09

Degree:Master

Type:Thesis

Country:China

Candidate:X D Wang

GTID:2428330545458763

Subject:Communication and Information System

Abstract/Summary:

Speech recognition in near-field scenarios has achieved satisfactory results, but far-field speech recognition remains challenging because of high levels of noise and reverberation. Compared with a single-channel microphone, microphone-array beamforming has become an important component of far-field intelligent speech acquisition and speech recognition. Deep neural networks (DNNs) have shown great advantages in speech recognition because of their powerful modeling capabilities. As a result, far-field speech recognition based on beamforming and deep neural networks has become an active research topic in recent years.

Building on microphone-array algorithms and deep neural networks, this thesis describes the basic theory of far-field speech recognition and the basic speech recognition pipeline, and analyzes how beamforming can be used for speech enhancement. The two major types of acoustic model used in speech recognition, the DNN-HMM acoustic model and the end-to-end acoustic model, are described in detail, along with the basic decoding algorithm used in speech recognition.

On this basis, the thesis studies far-field speech recognition methods combined with speech enhancement. Because traditional approaches treat speech enhancement and speech recognition as two independent processes, the thesis proposes two improvements. The first is a far-field speech recognition method based on an improved beamformer network, motivated by the observation that multichannel cross-correlation coefficient information is more robust in noisy and reverberant environments. In this method, the multichannel cross-correlation coefficient information serves as the input feature of the beamformer network, which estimates the MVDR beamformer parameters. Compared with the original algorithm, this method reduces computational complexity and system training time while improving recognition performance. The second is a far-field speech recognition method based on an attention-mechanism acoustic model. This method combines the speech enhancement network and the speech recognition model into a single system and extends the existing attention-based framework to far-field scenarios. The results show that this method can improve recognition performance.
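
For readers unfamiliar with the MVDR beamformer mentioned in the abstract, the following is a minimal sketch of the classical closed-form MVDR weights for a single frequency bin, w = R^{-1} d / (d^H R^{-1} d). It is not the thesis's method: the thesis estimates the beamformer parameters with a network fed by multichannel cross-correlation coefficient features, which is not reproduced here. The function names (`solve`, `mvdr_weights`, `steering_vector`), the uniform linear array geometry, and the toy noise covariance are all assumptions made for illustration.

```python
import cmath
import math


def solve(a, b):
    """Solve the complex linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: use the row with the largest magnitude entry.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0j] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mvdr_weights(noise_cov, steering):
    """Return w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    r_inv_d = solve(noise_cov, steering)
    denom = sum(d.conjugate() * v for d, v in zip(steering, r_inv_d))
    return [v / denom for v in r_inv_d]


def steering_vector(n_mics, spacing_m, angle_rad, freq_hz, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delay = spacing_m * math.cos(angle_rad) / c
    return [cmath.exp(-2j * math.pi * freq_hz * k * delay) for k in range(n_mics)]


# Toy setup (hypothetical values): 4 microphones, 5 cm spacing,
# a 1 kHz bin, and a source at 60 degrees.
d = steering_vector(4, 0.05, math.radians(60), 1000.0)
# Hypothetical noise covariance: unit power per channel with mild
# inter-channel correlation (Hermitian and positive definite).
R = [[(1.0 if i == j else 0.2) + 0j for j in range(4)] for i in range(4)]
w = mvdr_weights(R, d)
# Distortionless constraint: the response w^H d toward the source is 1.
response = sum(wi.conjugate() * di for wi, di in zip(w, d))
```

The final line checks the distortionless constraint that defines MVDR: the beamformer passes the target direction with unit gain while minimizing output noise power. With an identity noise covariance the weights reduce to a delay-and-sum beamformer, `d[k] / n_mics`.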

Keywords/Search Tags:

far-field speech recognition, beamforming, deep neural network, end-to-end, attention mechanism

Related items

1	Research On Beamforming Technology Of Deep Learning Far-field Speech Recognition
2	Research On Deep Learning Based Far-Field Speech Recognition
3	Chinese Speech Recognition Based On Deep Convolutional Neural Network
4	Research On Speech Emotion Recognition Model Based On Deep Neural Network
5	Speech Recognition Front-End Processing Based On Deep Neural Network
6	Speech Recognition Based On Deep Encoder And Decoder
7	Speech Emotion Recognition Based On Deep Learning
8	Deep Learning Models For Speech Emotion Recognition
9	Speech Emotion Recognition Based On Deep Learning Technology
10	Research On Multi-person Speech Recognition Based On Deep Learning