Exploiting Common Search Interests across Languages for Web Search

Posted on:2011-05-13

Degree:Ph.D

Type:Dissertation

University:The Chinese University of Hong Kong (Hong Kong)

Candidate:Gao, Wei

GTID:1448390002958465

Subject:Information Science

Abstract/Summary:

This work studies a new approach to Web search that serves users' cross-lingual information needs by exploiting the common search interests found across different languages. We assume a generic scenario in which monolingual users want to find relevant information under three general settings: (1) finding relevant information in a foreign language, which requires machine translation of the search results into the user's own language; (2) finding relevant information in multiple languages, including the source language, which also requires machine translation to translate the search results back; (3) finding relevant information only in the user's own language, where, because many queries are intrinsically cross-lingual, monolingual search can be assisted by cross-lingual information from another language.

We approach the problem by substantially extending two core mechanisms of information retrieval for Web search across languages, namely query formulation and relevance ranking. First, unlike traditional cross-lingual methods such as query translation and expansion, we propose a novel Cross-Lingual Query Suggestion model that leverages large-scale search engine query logs to learn to suggest closely related target-language queries for a given source-language query. The rationale behind this approach is the ever-increasing common search interests shared by Web users of different languages. Second, we generalize the usefulness of common search interests to enhance the relevance ranking of documents by exploiting the correlation among the search results derived from bilingual queries. This overcomes a weakness of traditional relevance estimation, which uses information from a single language only or treats different languages separately. To this end, we attempt to learn a ranking function that incorporates various similarity measures among the retrieved documents in different languages. By modeling the commonality or similarity of search results, relevant documents in one language can help the relevance estimation of documents in a different language and hence improve the overall relevance estimation. This intuition applies to all three settings described above.
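
The intuition that relevant documents in one language can support relevance estimation in another can be illustrated with a small sketch. This is not the dissertation's learned ranking function: it swaps in a fixed linear blend, and the function names, the `weight` parameter, and the similarity dictionary keyed by `(source_id, target_id)` pairs are all assumptions made for illustration. Document identifiers are assumed to be unique across the two languages.

```python
from typing import Dict, List, Tuple


def blend(own: float, neighbours: List[Tuple[float, float]], weight: float) -> float:
    """Mix a document's own score with the similarity-weighted mean of
    the scores of documents retrieved in the other language.

    neighbours holds (similarity, score) pairs.
    """
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        # No cross-lingual evidence: keep the monolingual score unchanged.
        return own
    support = sum(sim * score for sim, score in neighbours) / total_sim
    return (1 - weight) * own + weight * support


def rerank(
    src_scores: Dict[str, float],
    tgt_scores: Dict[str, float],
    similarity: Dict[Tuple[str, str], float],
    weight: float = 0.5,  # hypothetical mixing parameter, not from the source
) -> List[Tuple[str, float]]:
    """Return all documents from both languages, sorted by blended score."""
    combined: Dict[str, float] = {}
    for s_id, s_score in src_scores.items():
        nb = [(similarity.get((s_id, t_id), 0.0), t_score)
              for t_id, t_score in tgt_scores.items()]
        combined[s_id] = blend(s_score, nb, weight)
    for t_id, t_score in tgt_scores.items():
        nb = [(similarity.get((s_id, t_id), 0.0), s_score)
              for s_id, s_score in src_scores.items()]
        combined[t_id] = blend(t_score, nb, weight)
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
```

In this sketch, a target-language document with a modest monolingual score moves up the list when it is similar to a highly scored source-language document, while a document with no cross-lingual neighbours keeps its original score. A learned ranking function, as described in the abstract, would instead fit how such similarity features are combined from training data.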

Keywords/Search Tags:

Search, Language, Web, Information, Relevance estimation, Across, Cross-lingual

Related items

1	Cross Language Information Retrieval Based On Topical Pseudo Relevance Feedback
2	Research On Enhancing Natural Language Inference Through Knowledge Graph Embedding And Cross-lingual Transfer
3	Research On Cross-lingual Spoken Language Understanding
4	Research On Cross-lingual Word Similarity Computation
5	Research On Machine Reading Comprehension Model Based On Cross-lingual Transfer Technology
6	The Key Technologies Of Cross-lingual Aspect Sentiment Classification Towards E-commerce Reviews
7	A Method For Obtaining And Analyzing Cross-language News Event Information In Non-ferrous Industries
8	Cross-Lingual Word Sense Disambiguation for Languages with Scarce Resources
9	Research On Approaches For Cross-lingual Sentiment Analysis
10	Research On The Application Of Machine Translation In Cross-lingual Document Classification