Luo Ping Dialect Speech Recognition Research Based On Kaldi

Posted on:2019-09-06

Degree:Master

Type:Thesis

Country:China

Candidate:B Zhang

Full Text:PDF

GTID:2428330548468878

Subject:Communication and Information System

Abstract/Summary:

Speech recognition is a major means of human-computer interaction. In recent years, with the development of science and technology, speech recognition has come to be applied in all aspects of life. However, current Chinese speech recognition systems are based on Putonghua. China is a vast and ethnically diverse country with many dialects, and a Putonghua-based speech recognition system is far from meeting the needs of the public, so research on and applications for regional dialects are particularly necessary. This thesis briefly introduces the development history of speech recognition, expounds the basic principles of speech recognition technology, and analyzes the contribution of each component technology to the development of the field. It studies the technologies involved in the whole process, from the collection of the original analog speech signals to the construction of the language model and the acoustic model. The discussion focuses on the acoustic model. The acoustic models studied in this thesis include the monophone model, the triphone model, the optimized triphone model, the hidden Markov model (HMM), and the deep neural network (DNN) model. The language model is also studied, mainly the statistical N-gram model. Finally, the thesis analyzes the characteristics of the Luo Ping dialect and builds a speech recognition system based on Kaldi. Five sets of comparative experiments were set up to compare the recognition accuracy of the system across different acoustic models, different language models, and different training sample sizes. The experimental results show that, among the six acoustic models tested, the DNN-based acoustic model achieves the highest accuracy, up to 96.82%, and that the bigram model outperforms the unigram model. In the bigram experiments, as the number of training samples increased from 1980 to 2420, recognition accuracy improved continuously, which indicates that, within this range, a larger training set yields higher recognition accuracy. On this basis, the training and test samples were adjusted, and the results show that the system has good adaptability.
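The unigram/bigram comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which toolkit or smoothing method was used, so everything here is an assumption for illustration only: the toy English corpus, the function names, and the add-alpha smoothing are hypothetical and are not taken from the thesis or from Kaldi.

```python
import math
from collections import Counter

def train_counts(sentences):
    """Collect unigram, bigram, and bigram-context counts from tokenized sentences."""
    unigrams, bigrams, contexts = Counter(), Counter(), Counter()
    for words in sentences:
        tokens = words + ["</s>"]          # end-of-sentence marker
        unigrams.update(tokens)
        padded = ["<s>"] + tokens          # start marker is used only as a context
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return unigrams, bigrams, contexts

def unigram_logprob(words, unigrams, alpha=1.0):
    """Log-probability under a unigram model with add-alpha smoothing (assumed)."""
    total = sum(unigrams.values())
    vocab = len(unigrams)
    return sum(
        math.log((unigrams[w] + alpha) / (total + alpha * vocab))
        for w in words + ["</s>"]
    )

def bigram_logprob(words, unigrams, bigrams, contexts, alpha=1.0):
    """Log-probability under a bigram model with add-alpha smoothing (assumed)."""
    vocab = len(unigrams)
    padded = ["<s>"] + words + ["</s>"]
    return sum(
        math.log((bigrams[(p, c)] + alpha) / (contexts[p] + alpha * vocab))
        for p, c in zip(padded, padded[1:])
    )

# Hypothetical toy corpus standing in for transcribed dialect utterances.
corpus = [
    ["open", "the", "door"],
    ["close", "the", "door"],
    ["open", "the", "window"],
]
unigrams, bigrams, contexts = train_counts(corpus)

seen_order = ["open", "the", "door"]
scrambled = ["door", "the", "open"]
```

In this sketch the unigram model assigns the same score to `seen_order` and `scrambled`, because it ignores word order, while the bigram model scores `seen_order` higher. That sensitivity to word order is one plausible reason a bigram language model can outperform a unigram one in recognition experiments.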

Keywords/Search Tags:

Luo Ping dialect, DNN, Kaldi, Speech recognition

Related items

1	Research Of Automatic Speech Recognition Of The Asante-Twi Dialect For Translation
2	Research Of Speech Recognition Based On Kaldi
3	Research On Chinese Speech Recognition Based On Kaldi
4	Research And Design Of Industrial Voice Command Recognition Based On Machine Learnin
5	Research On Speech Recognition Based On Kaldi
6	Research On Chinese Speech Recognition Based On Kaldi
7	Application Research Of Deep Learning In Speech Recognition Of Sichuan Dialect
8	Speech Recognition Of Hainan Dialect Based On Deep Learning
9	Speech Enhancement Method For Tibetan Speech Recognition In Lhasa Dialect
10	Research On Yangzhou Dialect Speech Recognition Based On Isolated Words