The Implementation And Optimization Of Speech Recognition System Based On Kaldi

Posted on:2022-08-03

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Li

Full Text:PDF

GTID:2518306731477654

Subject:Computer technology

Abstract/Summary:

With the widespread application of automatic speech recognition (ASR) technology, many speech recognition providers have appeared on the market. Because of data sensitivity, cost, and customizability, demand for self-built ASR systems remains. However, most current ASR frameworks are developed for scientific research rather than for production. In addition, current ASR frameworks do not support customized requirements such as streaming recognition and word biasing. To solve these problems, this thesis studies the development and optimization of a telephone ASR system.

The thesis first introduces the significance of the topic and the history of ASR. It then introduces the foundations of ASR, including the n-gram language model, feature extraction, the acoustic model based on hidden Markov models, the WFST decoding algorithm, and sequence-discriminative training. Next, we analyze the Kaldi ASR framework at the source-code level. We then describe the details of system development: the neural network architecture, language model batch processing, and the architecture and development of the ASR service and the telephone call service. Finally, we introduce two usage scenarios of the system.

The thesis also introduces several engineering and algorithmic optimizations to the system. To overcome language model mismatch, we propose a decoder hot-word biasing algorithm, which improves the recognition accuracy of user-defined hot words when the system's language model does not match the current usage scenario. We implement a streaming recognition decoder and optimize it with backward pointers to avoid repeated lattice generation. To optimize the phone service, we use a ping-pong buffer to reduce audio card delay. We test the system on the open-source data set Aishell2 and two business data sets, and we analyze and compare the impact of sampling rate on accuracy. Experimental results show that our system achieves relatively good accuracy on the business data sets. Our system can meet the needs of a self-built phone ASR system.
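The ping-pong buffer mentioned above is a general double-buffering technique: audio is written into one fixed-size buffer while the other, already full, is handed to the consumer, and the two swap roles each time a buffer fills. The following is a minimal single-threaded sketch of that idea, not the thesis's implementation. The class name `PingPongBuffer`, the `frame_size` parameter, and the overflow policy are all assumptions made for illustration; a real audio card driver would call `write` from a capture callback and `read` from a separate thread, with locking.

```python
from typing import List, Optional


class PingPongBuffer:
    """Two fixed-size buffers: one is being filled while the other is read.

    Hypothetical illustration only; names and policies are assumptions.
    """

    def __init__(self, frame_size: int) -> None:
        self.frame_size = frame_size
        self._buffers: List[List[int]] = [[], []]
        self._write_idx = 0                 # buffer currently being filled
        self._ready: Optional[int] = None   # full buffer waiting for the consumer

    def write(self, samples: List[int]) -> None:
        """Append samples, swapping buffers whenever the active one fills."""
        for sample in samples:
            active = self._buffers[self._write_idx]
            active.append(sample)
            if len(active) == self.frame_size:
                if self._ready is not None:
                    # Assumed policy: fail loudly if the consumer fell behind.
                    raise OverflowError("both buffers are full")
                self._ready = self._write_idx
                self._write_idx ^= 1        # switch to the other buffer

    def read(self) -> Optional[List[int]]:
        """Return the full buffer if one is ready, otherwise None."""
        if self._ready is None:
            return None
        frame = self._buffers[self._ready]
        self._buffers[self._ready] = []     # release the buffer for reuse
        self._ready = None
        return frame


if __name__ == "__main__":
    pp = PingPongBuffer(frame_size=4)
    pp.write([1, 2, 3, 4, 5])
    print(pp.read())   # the first full frame; sample 5 is already in the other buffer
```

Because capture continues into the second buffer while the first is processed, the consumer never has to wait for a large block to be copied out, so the buffers can be kept small. The delay actually achieved depends on the frame size and the hardware, which this sketch does not model.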

Keywords/Search Tags:

Speech Recognition, Decoder, Audio Card, Word Biasing

Related items

1	Research On The WFST Based Chinese Speech Recognition Decoder
2	Key Technology Research On Audio Information Hiding And Information Security Application For Speech Recognition
3	Research And Implementation Of Audio Quality Evaluation And Speech Recognition Preprocessing Technology
4	Speech Endpoint Detection Based On Audio And Visual Features
5	Bimodal Speech Recognition Technology Research Based On Audio And Video
6	Applied Research On Specific Word Chinese Speech Recognition System
7	Research Of Audio Decoding Theory And Design Of Audio Decoder In DRM System
8	How does acoustic variability in speech affect infant word recognition and word learning
9	Audio and video indexing with speech recognition
10	Research On Noise Treatment Of Speech Recognition With Lip-movement Information