Research On Acoustic Modeling In Low Resource Speech Recognition Based On Transfer Learning

Posted on:2020-07-12

Degree:Master

Type:Thesis

Country:China

Candidate:J C Wang

GTID:2428330590454689

Subject:Engineering

Abstract/Summary:

Automatic Speech Recognition (ASR) converts human speech into text by computer processing and is a key technology for communication between people and machines. The DNN-HMM (Deep Neural Network-Hidden Markov Model) has become the most popular acoustic model in large-vocabulary continuous speech recognition (LVCSR). Supported by large amounts of speech data, deep neural network ASR systems have achieved results close to human transcription ability. However, of the more than 7,000 languages in the world, only a few, such as English and Mandarin, have large amounts of speech data. Most other languages have only a small amount of speech data available for research, and collecting speech resources is very expensive. Deep neural network speech recognition systems often perform poorly in such low-resource settings, while the demand for low-resource speech recognition keeps increasing as society develops.

Transfer learning is a method of learning knowledge from one or more similar tasks and using that knowledge to quickly build a model for a new task. In DNN-based ASR, the output of each layer of the DNN acoustic model is a deep representation of the speech features, which captures acoustic characteristics common to human speech. Because of this commonality, the acoustic models of other languages can be transferred by adapting the DNN parameters, which makes it possible for low-resource speech recognition to obtain a strong acoustic model through transfer learning.

To improve the performance of DNN-based acoustic models in low-resource speech recognition systems, this thesis studies several aspects of transfer learning for acoustic models. The specific contents include: whether transferring a cross-lingual acoustic model is effective; the influence of the similarity between languages on transfer learning performance; the impact of the amount of training data for the base model on transfer learning performance; the training method for the transferred acoustic model; whether a monophone acoustic model can be transferred to a triphone acoustic model; and the transfer of a shared-hidden-layer model trained on large multilingual data, among other topics. The base acoustic models are trained on resource-rich Chinese and English, and various experiments are carried out using Uyghur as the low-resource language. The experimental results show that transferring the base model through transfer learning improves the performance of low-resource language acoustic models.
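The parameter transfer described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a plain feed-forward DNN, and the layer sizes, state counts, and function names (`build_dnn`, `transfer`) are hypothetical. The hidden layers of a source-language model are copied, and a new output layer sized for the target language's HMM states is attached. Fine-tuning on the low-resource data would follow and is not shown.

```python
import copy
import random


def init_layer(n_in, n_out, rng):
    """Return a dense layer with small random weights and zero biases."""
    return {
        "W": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def build_dnn(n_in, hidden_sizes, n_out, rng):
    """Build a feed-forward acoustic model: hidden layers plus one output layer."""
    sizes = [n_in] + list(hidden_sizes)
    hidden = [init_layer(sizes[i], sizes[i + 1], rng) for i in range(len(hidden_sizes))]
    return {"hidden": hidden, "output": init_layer(sizes[-1], n_out, rng)}


def transfer(source_model, n_target_states, rng):
    """Copy the source hidden layers and attach a fresh target-language output layer."""
    hidden = copy.deepcopy(source_model["hidden"])
    last_hidden_dim = len(hidden[-1]["b"])
    return {"hidden": hidden, "output": init_layer(last_hidden_dim, n_target_states, rng)}


rng = random.Random(0)
# Hypothetical sizes: 40-dim features, three hidden layers of 64 units,
# 300 source-language states, 120 target-language states.
source = build_dnn(40, [64, 64, 64], 300, rng)
target = transfer(source, 120, rng)
```

The output layer is replaced rather than copied because the source and target languages have different sets of HMM states, while the hidden layers carry the representation that is assumed to be shared across languages.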

Keywords/Search Tags:

Automatic Speech Recognition (ASR), Deep Neural Network (DNN), low resource, transfer learning, acoustic model

Related items

1	Research On Uyghur Speech Recognition Based On Deep Learning
2	Research On Transfer Learning For Khalkha Mongolian Speech Recognition Acoustic Model
3	Research On Acoustic Model Of Speech Recognition In Educational Scene Based On Deep Learning
4	Research On Chinese Speech Recognition System Based On Deep Learning
5	Research On Mongolian Speech Recognition Acoustic Model Based On Deep Learning
6	Research On Speech Recognition Based On Convolutional Neural Networks
7	Uyghur Speech Recognition Based On Deep Recurrent Neural Network
8	Acoustic Model Of Speech Recognition Based On Lightweight Neural Network And Its Application In Robot
9	Speech Recognition Front-End Processing Based On Deep Neural Network
10	Deep Neural Network For Chinese Speech Recognition