Research On Mandarin Spoken Document Retrieval Based On Lattice

Posted on: 2011-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Gao
Full Text: PDF
GTID: 2178330332960484
Subject: Signal and Information Processing
Abstract/Summary:
Spoken document retrieval technology helps people find relevant information in the flood of available information resources. With advances in speech recognition, combining information retrieval with speech recognition to build spoken document retrieval systems has become a trend. In most cases, however, model mismatch or noise makes the one-best speech recognition output too unreliable for use in a spoken document retrieval system.

To address this problem, this thesis considers both the retrieval source and the retrieval model, and combines them into a new Mandarin spoken document retrieval method. For the retrieval source, a syllable lattice that provides multiple recognition hypotheses is adopted, which reduces the effect of speech recognition errors on retrieval. At the same time, the syllable-based approach effectively handles out-of-vocabulary words in the query. For the retrieval model, a document length prior is combined with the traditional query likelihood retrieval model.

Experimental results show that the lattice-based method outperforms the one-best method. Furthermore, the lattice-based approach achieves its best performance when the retrieval model includes the document length prior, with an improvement of about 30%. The experiments indicate that the new method is correct, feasible, and effective.
Keywords/Search Tags: spoken document retrieval, syllable lattice, document priors
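
The abstract does not give the scoring formula, so the following is only a minimal sketch of how a query likelihood model with a document length prior might be applied to lattice-derived counts. Several things here are assumptions rather than the thesis's method: the index terms are syllable bigrams, each document is represented by fractional expected counts (as could be obtained from lattice arc posteriors), the term probabilities use Dirichlet smoothing with a hypothetical parameter `mu`, and the prior P(D) is taken to be proportional to the expected document length. The function names and the toy data are illustrative only.

```python
import math
from collections import Counter


def collection_counts(docs):
    """Sum the expected term counts of all documents (assumed background model)."""
    total = Counter()
    for counts in docs.values():
        total.update(counts)
    return total


def score(query_terms, doc_counts, coll_counts, mu=1.0, use_length_prior=True):
    """Return log P(D) + sum over query terms of log P(q | D).

    doc_counts holds expected (fractional) counts, e.g. from a syllable lattice.
    Dirichlet smoothing and the length-proportional prior are assumptions.
    """
    doc_len = sum(doc_counts.values())
    coll_len = sum(coll_counts.values())
    log_score = 0.0
    for term in query_terms:
        p_coll = coll_counts.get(term, 0.0) / coll_len
        p_term = (doc_counts.get(term, 0.0) + mu * p_coll) / (doc_len + mu)
        if p_term == 0.0:
            # Term never seen anywhere in the collection.
            return float("-inf")
        log_score += math.log(p_term)
    if use_length_prior:
        # Assumed prior: P(D) proportional to expected document length.
        log_score += math.log(doc_len / coll_len)
    return log_score


def rank(query_terms, docs, **kwargs):
    """Return document ids sorted from highest to lowest score."""
    coll = collection_counts(docs)
    return sorted(
        docs,
        key=lambda doc_id: score(query_terms, docs[doc_id], coll, **kwargs),
        reverse=True,
    )


# Hypothetical expected syllable-bigram counts for two spoken documents.
docs = {
    "d1": {"bei jing": 0.9, "jing da": 0.4, "da xue": 0.7},
    "d2": {"bei jing": 0.2, "shang hai": 0.8},
}
query = ["bei jing", "da xue"]
ranking = rank(query, docs)
```

Because the counts are expectations over all lattice paths, a syllable bigram that is missing from the one-best transcript can still contribute to the score, which is the intuition behind preferring the lattice as the retrieval source. Calling `rank(query, docs, use_length_prior=False)` gives the plain query likelihood ranking for comparison.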