Research on Long Speech and Text Alignment

Posted on: 2014-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhou
Full Text: PDF
GTID: 2268330401990288
Subject: Computational Mathematics
Abstract/Summary:
Long audio and text alignment can support large-scale study of rich spoken-language resources, e.g., collections of audio books or multimedia documents. For such resources, the conventional Viterbi forced alignment algorithm often proves inadequate, mainly because, first, the transcription must be accurate and, second, the audio has to be relatively noise-free. Because the speech recognition rate has improved significantly in recent years, it is now practical to use a speech recognition engine to solve the speech-text alignment problem. In this paper, we present a C++ program for robust long speech-text alignment that circumvents these restrictions. It implements an adaptive, iterative speech recognition and text alignment scheme that allows for the processing of very long (and possibly noisy) audio and is robust to audio with white noise. The program is evaluated on artificially created long chunks of the TIMIT database and the 863 speech database. The audio is artificially contaminated with babble noise, and we present the corresponding word boundary detection results.
Keywords/Search Tags: HTK, Robust, Speech recognition, Long speech-text alignment, Adaptation, Edit distance
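
The abstract does not say how edit distance enters the alignment scheme. One common use, assumed here for illustration only, is to align the recognizer's word output against the transcript and treat the exactly matching words as anchors between which the remaining audio can be re-aligned. The minimal Python sketch below shows that word-level alignment. The function name `align_words` and the sample sentences are hypothetical and are not taken from the thesis or its C++ program.

```python
from typing import List, Tuple


def align_words(ref: List[str], hyp: List[str]) -> Tuple[int, List[Tuple[int, int]]]:
    """Levenshtein-align two word lists.

    Returns the edit distance and the (ref_index, hyp_index) pairs of words
    that match exactly on one optimal alignment path. This is an illustrative
    sketch, not the thesis's implementation.
    """
    n, m = len(ref), len(hyp)
    # dist[i][j] holds the edit distance between ref[:i] and hyp[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,        # deletion
                dist[i][j - 1] + 1,        # insertion
                dist[i - 1][j - 1] + sub,  # match or substitution
            )

    # Trace back from the bottom-right corner, collecting exact matches.
    matches: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = 0 if ref[i - 1] == hyp[j - 1] else 1
        if dist[i][j] == dist[i - 1][j - 1] + sub:
            if sub == 0:
                matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return dist[n][m], matches


if __name__ == "__main__":
    transcript = "the quick brown fox jumps".split()
    recognized = "the quick brown dog jumps".split()
    distance, anchors = align_words(transcript, recognized)
    print(distance, anchors)
```

In an iterative scheme of the kind the abstract describes, runs of consecutive anchors would typically be trusted and the unmatched stretches between them re-recognized. That policy is also an assumption here, not a detail stated in the abstract.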