Key Issues Of Spoken Document Retrieval Based On Syllable-Fragment Lattice

Posted on: 2013-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: G X Chen
Full Text: PDF
GTID: 2248330377458928
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of the Internet and information technology, multimedia information resources keep growing, and retrieval and classification of multimedia content are becoming more important. As the volume of speech data from radio, meetings, and the Internet increases, finding the information we need in a flood of speech recordings has become a key problem.

Spoken document retrieval takes a user's input query, searches the files in a speech collection, and returns the spoken documents related to the query. It usually consists of two stages: an offline indexing stage and an online retrieval stage. The offline indexing stage includes two modules, speech recognition and index building, and the performance of online retrieval depends closely on both.

Speech recognition results take three forms: One-best, N-best, and Lattice. A Lattice is a directed acyclic graph that contains more candidate results, which can compensate for the impact of speech recognition errors and improve retrieval performance. Lattices are therefore widely used by speech recognition researchers, and Lattice-based spoken document retrieval has become the mainstream approach. In Chinese speech recognition, syllables are used as the basic unit in preference to characters, words, or sentences, because their number is limited and they are rich in expressive content; moreover, syllables can effectively solve the out-of-vocabulary (OOV) problem.

In a spoken document retrieval system based on syllable Lattices, the Lattice contains too much redundant information and has a complex structure, so this research studies how to generate a confusion network from the Lattice. A confusion network is a more concise and efficient network: it is close to a linear structure, retains abundant information, and is easy to process. Compared with a Lattice, a confusion network index occupies less space and is more suitable for subsequent retrieval.
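The near-linear structure described above can be illustrated with a minimal sketch. The abstract does not specify the index layout, so the slot representation, the `build_index` helper, and the toy syllables and posteriors below are all assumptions made for illustration, not the thesis's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed representation: a confusion network is a list of slots, and each
# slot maps a candidate syllable to its posterior probability.
ConfusionNetwork = List[Dict[str, float]]


def build_index(
    docs: Dict[str, ConfusionNetwork],
) -> Dict[str, List[Tuple[str, int, float]]]:
    """Build an inverted index: syllable -> [(doc_id, slot_position, posterior)]."""
    index: Dict[str, List[Tuple[str, int, float]]] = defaultdict(list)
    for doc_id, network in docs.items():
        for position, slot in enumerate(network):
            for syllable, posterior in slot.items():
                index[syllable].append((doc_id, position, posterior))
    return index


# Hypothetical toy data: two spoken documents, two slots each.
docs = {
    "doc1": [{"yu": 0.7, "yi": 0.3}, {"yin": 0.9, "ying": 0.1}],
    "doc2": [{"jian": 0.6, "qian": 0.4}, {"suo": 1.0}],
}
index = build_index(docs)
```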
To improve retrieval accuracy, this thesis selects high-frequency syllable combinations as word-fragments, builds a language model over syllables and word-fragments, generates Lattices that mix syllables and word-fragments, and then transforms them into confusion networks, which improves the recognition rate. The traditional vector space model (VSM) is poorly suited to spoken document retrieval over multi-candidate recognition results, so this research changes the weight calculation method to make it more suitable for retrieval based on confusion networks. The experiments show that introducing word-fragments greatly improves the recognition rate of both the Lattice and the confusion network, and that the confusion network index is more concise and efficient than the Lattice index. Compared with the baseline system, the accuracy and result ranking of the confusion-network-based spoken document retrieval system are greatly improved.
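The abstract does not state the modified weight formula. As a hedged sketch of one way a VSM can account for multiple candidates, the example below replaces raw term counts with expected counts (the sum of a term's posteriors across slots) before TF-IDF weighting and cosine scoring. The function names, the smoothed IDF, and the toy data are assumptions for illustration only.

```python
import math
from collections import defaultdict


def expected_tf(network):
    """Sum each term's posterior over all slots (its expected occurrence count)."""
    tf = defaultdict(float)
    for slot in network:
        for term, posterior in slot.items():
            tf[term] += posterior
    return dict(tf)


def rank(query_terms, docs):
    """Rank documents by cosine similarity of posterior-weighted TF-IDF vectors."""
    tfs = {doc_id: expected_tf(net) for doc_id, net in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents in which a term appears at all.
    df = defaultdict(int)
    for tf in tfs.values():
        for term in tf:
            df[term] += 1
    # Smoothed IDF (an assumed variant) so that weights stay positive.
    idf = {term: math.log(1 + n_docs / count) for term, count in df.items()}
    query = {term: idf.get(term, 0.0) for term in set(query_terms)}
    q_norm = math.sqrt(sum(w * w for w in query.values()))
    scores = {}
    for doc_id, tf in tfs.items():
        vec = {term: count * idf[term] for term, count in tf.items()}
        d_norm = math.sqrt(sum(w * w for w in vec.values()))
        dot = sum(w * vec.get(term, 0.0) for term, w in query.items())
        scores[doc_id] = dot / (q_norm * d_norm) if q_norm and d_norm else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy confusion networks (slot: candidate -> posterior).
docs = {
    "doc1": [{"yu": 0.7, "yi": 0.3}, {"yin": 0.9, "ying": 0.1}],
    "doc2": [{"jian": 0.6, "qian": 0.4}, {"suo": 1.0}],
}
ranking = rank(["yu", "yin"], docs)
```

In this sketch a low-posterior candidate still contributes to the document vector, but in proportion to its confidence, so a competing hypothesis cannot count as much as a confident recognition.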
Keywords/Search Tags: Spoken Document Retrieval, Lattice, Confusion Network, Word Fragment