
Research On End-to-end Speech Recognition Algorithm Based On Language Model In Noisy Environment

Posted on: 2024-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: W Sheng
Full Text: PDF
GTID: 2568306920975029
Subject: Information and Communication Engineering
Abstract/Summary:
Speech recognition is an increasingly widely used technology that has freed people from manual human-computer interaction in some fields, but the strong impact of noise on speech recognition systems has brought the application of the technology to a bottleneck. To address the degraded performance and low recognition rate of speech recognition systems in noisy environments, this thesis makes the following contributions. First, it proposes CA-DCDCCRN, a speech noise reduction algorithm based on a coordinated attention deep complex densely connected convolutional recurrent network. The algorithm replaces standard convolution with dense convolution to strengthen the deep supervision and feature reuse capability of the noise reduction network, and then introduces a coordinated attention mechanism that enables the mobile network to focus on large regions and assign different attention weights to different feature channels, so as to extract detailed information from the spectrogram of the noisy speech. Second, it proposes LLMT, a transform-based lightweight language model. The core of LLMT is a weight calculation method based on weight transformation and the Hadamard matrix, which realizes weight reuse and solves the problem of singularity of shared parameters; the Hadamard matrix calculation also solves the problem of unused weights in part of the multi-head attention and improves the encoding and decoding speed of the model. In addition, the thesis designs a feature-compensated lightweight feedforward network, which integrates features through dimension-raising and dimension-reduction operations, reduces the computation of network parameters, and uses feature compensation to preserve the performance of the feedforward network. Finally, building on these two contributions, the thesis proposes ENSRILLM, an end-to-end noisy speech recognition algorithm that fuses lightweight language models. It also constructs the ENSRILLM-S and ENSRILLM-L noisy speech recognition models according to different language model fusion methods, and conducts experiments on the Aishell-1, Thchs-30, Aidatatang, and Magicdata datasets to verify the effectiveness of the proposed algorithms.
Keywords/Search Tags: Coordinated attention, Speech noise reduction, Weight calculation, Lightweight language models, End-to-end noisy speech recognition