Robust spoken document retrieval in multilingual and noisy acoustic environments

Posted on: 2010-04-28
Degree: Ph.D.
Type: Thesis
University: University of Colorado at Boulder
Candidate: Akbacak, Murat
Full Text: PDF
GTID: 2448390002477012
Subject: Engineering
Abstract/Summary:
This thesis focuses on rapidly and effectively increasing the capability and robustness of spoken information search technology across languages and acoustic conditions. It makes two primary contributions in distinct yet related areas.

The first contribution addresses language leveraging from a speech retrieval point of view. Specifically, language leveraging algorithms are proposed for deploying spoken information search systems in new languages for which training resources are limited. The key idea is to exploit knowledge from resource-rich languages, together with their similarity to the target language, to improve spoken information search performance in the target language. Based on this idea, multilingual parallel and hybrid system combination algorithms are proposed using phonetic lattice-based document and query representations. Experiments on a proper name retrieval task show that the retrieval performance degradation caused by data sparseness during automatic speech recognition development in the target language is compensated for by employing a phonetic recognition system from a resource-rich language. The proposed algorithms for developing multilingual spoken information search technology in under-represented languages are shown to achieve comparable retrieval performance using less training data. As a side contribution, a similar idea is applied to a bilingual speaker recognition task in which training and test data can be in a person's native language (L1) or in a second language (L2). Again, the acoustic similarity between the two languages is explored to effectively combine the individual language-dependent speaker recognition systems in a parallel or hybrid fashion.

The second contribution focuses on the impact of changing acoustic conditions on retrieval performance in heterogeneous spoken audio collections. To reduce acoustic mismatch, the proposed methods for robust audio indexing and retrieval employ an Environmental Sniffing module that organizes data according to acoustic content and captures knowledge used to adapt spoken document retrieval to changing acoustic conditions. Based on this idea, robust parallel or hybrid system combination approaches are investigated using large vocabulary continuous speech recognition (LVCSR) based and sub-word based retrieval systems. Lattice-based vector space retrieval models are implemented using transducer indexes. This adaptive scheme yields a significant improvement in retrieval performance over traditional system combination methods.

Collectively, these contributions enable rapid transition of spoken document retrieval to new languages and acoustically heterogeneous audio collections.
Keywords/Search Tags: Spoken, Retrieval, Acoustic, Language, Robust, System combination, Multilingual