Research On Active Learning Of ASR Under Low-resource Conditions

Posted on: 2018-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: C D Xie
GTID: 2348330512485637
Subject: Information and Communication Engineering
Abstract/Summary:
As automatic speech recognition (ASR) technology for high-resource languages has developed and matured, ASR under low-resource conditions has gradually become an important research topic. This dissertation applies active learning methods to the modeling and optimization of ASR under low-resource conditions, using two main techniques. First, a perplexity criterion is used to select unlabeled data for training, and a matching optimization method is adopted during modeling. Second, word embedding technology is used to select web data that enhances the lexicon and expands the language model.

To begin with, an ASR system is built under low-resource conditions, with an acoustic model trained on deep neural networks (DNNs). To obtain the best triphone states and to compensate for the lack of expert knowledge under low-resource conditions, phonetic questions are generated automatically in a data-driven manner and used for state tying. To address the shortage of data, the best DNN models of larger-resource languages are employed as the initial networks of the target DNN model.

Next, acoustic modeling needs more labeled data to estimate the model parameters, but labeled data are very scarce for low-resource languages. Large amounts of cheap unlabeled speech data, by contrast, can now be obtained easily with modern equipment. To save manual labeling cost, a perplexity-based method is used to select from the abundant unlabeled data, and the selected unlabeled data are combined with the original labeled data to train the acoustic model. Moreover, to improve ASR performance, the network parameters of the acoustic model are adjusted in the last iteration using only the correctly labeled data.

Finally, a large number of out-of-vocabulary (OOV) words appear in ASR tasks under low-resource conditions, which means the lexicon has poor coverage, and the text corpus used to build the language model is relatively small. It is therefore difficult to obtain a lexicon with better coverage and a strong language model. With the development of the internet, web data can be collected easily, and in this work the word embedding method is used to process the collected web data. Useful OOV words from the web data are used to expand the lexicon, and informative text from the web data is used to enhance the language model. Together, these steps improve the performance of ASR under low-resource conditions. Experiments are conducted on the NIST OpenKWS2015 Swahili and NIST OpenKWS2016 Georgian development sets, and the results show that the proposed approaches bring significant performance improvements.
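
The perplexity-based selection step described above can be illustrated with a minimal sketch. The abstract does not say how perplexity is computed or in which direction the threshold is applied, so everything here is an assumption made for illustration: a unigram language model with add-one smoothing trained on the labeled transcripts, automatic transcripts (hypotheses) of the unlabeled utterances as the text being scored, and a rule that keeps hypotheses whose perplexity is at or below a threshold. The function names `train_unigram`, `perplexity`, and `select_by_perplexity` are hypothetical and do not come from the thesis.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Build an add-one smoothed unigram model from whitespace-tokenized text.

    Assumption: a unigram model stands in for whatever language model
    the thesis actually uses.
    """
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def perplexity(prob, sentence):
    """Per-word perplexity of a sentence under the given model."""
    words = sentence.split()
    if not words:
        return float("inf")  # an empty hypothesis is never selected
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


def select_by_perplexity(prob, hypotheses, threshold):
    """Keep hypotheses whose perplexity is at or below the threshold.

    Assumption: low perplexity marks a hypothesis as reliable enough to
    train on; the thesis may use a different selection rule.
    """
    return [h for h in hypotheses if perplexity(prob, h) <= threshold]
```

In use, the model would be trained on the original labeled transcripts, and the utterances whose hypotheses pass the threshold would be added to the acoustic training set together with the labeled data.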
Keywords/Search Tags:Low-resource, Active Learning, Speech Recognition, Deep Neural Network, Web Data