Resolving ambiguity for cross-language information retrieval: A dictionary approach

Posted on: 2002-02-21
Degree: Ph.D.
Type: Dissertation
University: University of Massachusetts Amherst
Candidate: Ballesteros, Lisa Ann
Full Text: PDF
GTID: 1468390011497748
Subject: Computer Science
Abstract/Summary:
The global exchange of information has been facilitated by the rapid expansion in the size and use of the Internet, which has led to a large increase in the availability of on-line texts. Expanded international collaboration, the increase in the availability of electronic foreign language texts, the growing number of non-English-speaking users, and the lack of a common language of discourse compel us to develop cross-language information retrieval (CLIR) tools capable of bridging the language barrier. Cross-language retrieval bridges this gap by enabling a person to search in one language and retrieve documents across languages.

There are several goals for the research described herein. The first is to gain a clear understanding of the problems associated with the cross-language task and to develop techniques for addressing them. Empirical work shows that ambiguity and lack of lexical resources are the main hurdles. Second, we show that cross-language effectiveness does not depend upon linguistic analysis. We demonstrate how statistical techniques can be used to significantly reduce the effects of ambiguity. We also show that combining these techniques is as effective as or more effective than a reasonable machine translation system. Third, we show that an approach based on multi-lingual dictionaries and statistical analysis can be used as the foundation for a cross-language retrieval architecture that circumvents the problem of limited resources.
Keywords/Search Tags: Cross-language, Retrieval, Information, Ambiguity