Acoustic Model Expansion Based On Active Learning

Posted on: 2015-04-04    Degree: Master    Type: Thesis
Country: China    Candidate: W G Pan    Full Text: PDF
GTID: 2298330422491927    Subject: Computer technology
Abstract/Summary:
Speech is one of the most natural and convenient ways to exchange information in human society, especially today, with the rapid development of information and multimedia technology. Speech data on speech communication networks and the Internet is growing rapidly. Analyzing and processing this data to retrieve the information people are interested in therefore has significant practical value. As the core technology of speech information processing, speech recognition, which lets a machine transform a speech signal into text, is attracting more and more attention. Making machines understand human speech is a long-standing dream of science and technology. Speech recognition technology has a wide range of applications and commercial value in speech analysis, intelligent control, and other fields.

In recent years, speech recognition has been a hot research area in speech processing and has attracted a large number of researchers. The hidden Markov model (HMM) is the main method in this research and has made some progress. However, a recognizer that performs well in the laboratory performs less well in real environments. The main reason is that the trained model does not match real speech well. HMM-based speech recognition represents the acoustic and linguistic knowledge of the training data with statistical models. Its performance depends on the amount of training data and on the breadth of the data's coverage, both of which are limited in real environments. It is impossible to cover every pronunciation situation.

Humans learn from their mistakes through continuous, active, and cumulative learning, and in this way improve and expand the knowledge in the brain. The same approach can be applied to speech recognition: the knowledge behind misrecognized speech is turned into new models, and continued learning of this kind reduces the model mismatch.
To achieve this, two problems must be solved:
(1) Acoustic model expansion: how to add newly acquired speech knowledge to the existing acoustic models. Because existing speech recognition technology is mostly based on statistical methods, the acoustic model cannot be expanded directly.
(2) Cumulative learning: learning from mistakes is a long-term, continuous, online process that requires continual correction. How to achieve this within the existing speech recognition framework is the focus of this thesis.

This thesis carries out a series of studies on these problems. We propose two acoustic model expansion methods, one based on homogeneous primitives and one based on heterogeneous primitives, and combine them with an active learning mechanism to solve the cumulative learning problem of acoustic model expansion. The main research work and innovations are as follows:
(1) An acoustic model expansion method based on maximum a posteriori (MAP) estimation. The added models are of the same type as the original models. They are derived from the original models with the MAP method and are then mixed with the original models.
(2) A heterogeneous-primitive acoustic model expansion method. To reduce the number of models added, the new models are trained at the syllable level. To balance the phone models and the syllable models, a penalty is set at the entrance of the newly added models in the recognition network.
(3) A speech recognition system framework based on an active learning mechanism. Using the confidence measure of active learning, misrecognized data is identified with posterior probabilities computed from the lattice, the intermediate result of speech recognition. This data is labeled and accumulated, and the model expansion methods are applied continually to add models, so that the recognizer is calibrated while it is running.
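As an illustration of contribution (1), the following is a minimal sketch of the standard MAP update for a single Gaussian mean. The abstract does not say which parameters the thesis adapts or how the prior weight is chosen, so the function name `map_adapt_mean`, the prior weight `tau`, and the per-frame occupation probabilities `posteriors` are assumptions made for this example, not the author's implementation.

```python
from typing import List, Sequence


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    posteriors: Sequence[float],
    tau: float = 10.0,
) -> List[float]:
    """Standard MAP update of one Gaussian mean (illustrative only).

    prior_mean : mean of the original (prior) Gaussian
    frames     : adaptation feature vectors, one per frame
    posteriors : occupation probability of this Gaussian for each frame
    tau        : assumed prior weight; larger values stay closer to prior_mean
    """
    if len(frames) != len(posteriors):
        raise ValueError("frames and posteriors must have the same length")
    if tau <= 0:
        raise ValueError("tau must be positive")

    dim = len(prior_mean)
    occupancy = sum(posteriors)      # total soft count of the new data
    weighted_sum = [0.0] * dim       # posterior-weighted sum of the frames
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the new data statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With little adaptation data the result stays near the original mean, and with more data it moves toward the mean of the new data. This is one way a new model can be derived from an original model without discarding it.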
Keywords/Search Tags: Speech recognition, Active learning, Model expansion