
Speech Recognition Framework Based On Sphinx And Its Performance Optimization

Posted on: 2015-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Tan
Full Text: PDF
GTID: 2298330467488815
Subject: Computer technology
Abstract/Summary:
Speech recognition is a human-computer interaction technology. It is based on the noisy channel model and has changed how humans and machines communicate. It is now widely used in all kinds of devices and is affecting people's lifestyles. In the speech recognition field, open-source tools play a prominent role in popularizing the technology. This thesis builds on the open-source Sphinx speech recognition toolkit from Carnegie Mellon University and analyzes its framework in depth. It then proposes and implements a new parallel confusion network formation algorithm and a new rescoring algorithm based on a genetic algorithm. The main work can be summarized in the following three steps:

(1) Analysis of each functional module of Sphinx-4, such as the decoding module, the front-end module, and the linguist module, with emphasis on the decoding module, to lay a solid foundation for understanding speech recognition technology.

(2) Analysis of the confusion network formation algorithm that Sphinx-4 uses to minimize word errors, and a proposal for a new algorithm that generates the confusion network. The new algorithm uses a tree structure and node clustering to segment the lattice nodes and then builds the confusion network in parallel. Parallel construction significantly reduces the time required while maintaining the quality of the confusion network. Experiments on Sphinx-4 show that the new algorithm is effective.

(3) Study of existing rescoring algorithms, and implementation on Sphinx-4 of a new rescoring algorithm that uses a genetic algorithm. The output of the speech recognition engine is converted into a confusion network, the chromosomes required by the genetic algorithm are constructed from it, and the chromosomes are then rescored. The thesis also analyzes how the mutation rate, the crossover rate, and the rescoring function affect the word recognition rate. Experiments on Sphinx-4 show that this algorithm achieves a better word recognition rate than the iterative rescoring algorithm.
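
The idea in step (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation and does not use the Sphinx-4 API. The toy confusion network, the bigram scores, the fitness function (log posteriors plus a weighted bigram score), the tournament selection, and the elitism step are all assumptions chosen for illustration. Each chromosome holds one gene per confusion network slot, and the gene is the index of the word chosen in that slot.

```python
import math
import random

# Hypothetical confusion network: one list of (word, posterior) pairs per slot.
CONFUSION_NETWORK = [
    [("I", 0.6), ("eye", 0.4)],
    [("scream", 0.55), ("like", 0.45)],
    [("ice", 0.5), ("eyes", 0.5)],
    [("cream", 0.7), ("creme", 0.3)],
]

# Hypothetical bigram log-scores standing in for a real language model.
BIGRAM_LOGP = {
    ("I", "like"): -0.5,
    ("like", "ice"): -0.7,
    ("ice", "cream"): -0.2,
}
UNSEEN_LOGP = -4.0  # assumed penalty for bigrams missing from the table


def decode(chromosome, network):
    """Map a chromosome (one word index per slot) to its word sequence."""
    return [network[i][gene][0] for i, gene in enumerate(chromosome)]


def fitness(chromosome, network, lm_weight=1.0):
    """Assumed rescoring function: log posteriors plus weighted bigram score."""
    words = decode(chromosome, network)
    posterior = sum(math.log(network[i][g][1]) for i, g in enumerate(chromosome))
    lm = sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(words, words[1:]))
    return posterior + lm_weight * lm


def rescore(network, pop_size=20, generations=30,
            crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Search the confusion network for a high-fitness path with a simple GA."""
    rng = random.Random(seed)

    def score(chromosome):
        return fitness(chromosome, network)

    # Seed the population with the highest-posterior path plus random paths.
    greedy = [max(range(len(slot)), key=lambda g: slot[g][1]) for slot in network]
    population = [greedy] + [
        [rng.randrange(len(slot)) for slot in network] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = [population[0][:]]  # elitism: always keep the current best
        while len(next_gen) < pop_size:
            # Tournament selection of two parents (tournament size 2).
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            child = p1[:]
            if len(network) > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, len(network))  # single-point crossover
                child = p1[:cut] + p2[cut:]
            for i, slot in enumerate(network):
                if rng.random() < mutation_rate:  # mutation: re-pick this slot
                    child[i] = rng.randrange(len(slot))
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return decode(best, network), score(best)


if __name__ == "__main__":
    words, best_score = rescore(CONFUSION_NETWORK)
    print(" ".join(words), round(best_score, 3))
```

Because the initial population contains the highest-posterior path and the best chromosome is carried into every generation, the returned score in this sketch is never lower than that of the unrescored path. Varying `mutation_rate`, `crossover_rate`, and `lm_weight` is one way to explore the kind of interaction the abstract describes.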
Keywords/Search Tags: Sphinx-4 speech recognition engine, confusion network, lattice rescoring algorithm, parallel algorithm, genetic algorithm