
Verbose Queries Processing Based On Dependency Relationship And Learning To Rank

Posted on: 2014-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Yao
Full Text: PDF
GTID: 2248330398450383
Subject: Computer application technology
Abstract/Summary:
Most search engines tend to return better retrieval results for keyword queries than for verbose queries. While most queries submitted to search engines are between one and three terms long, a gradual increase in average query length has been observed. Queries of five or more words have increased at a year-over-year rate of 10%, while single-word queries have dropped by 3%. These longer queries are typically used to convey more sophisticated information needs. Verbose queries tend to contain more redundant terms; these terms carry grammatical meaning that helps humans identify the important concepts when communicating. Unfortunately, the performance of most commercial and academic search engines deteriorates when handling longer queries.

In this thesis, we present two techniques for long queries: verbose query processing based on dependency relationships, and query term ranking based on semantics and learning to rank.

Our first approach differs from keyword-based techniques. The features used for query term ranking and selection in previous work do not consider grammatical relationships between terms. To address this issue, we use syntactic features extracted from the dependency parses of verbose queries. Our approach focuses on the characteristics of long queries: a good selection method should be able to leverage the grammatical roles of terms within a query. We use a dependency parsing toolkit to analyze long queries. By using syntactic features, we can take grammatical relationships between terms into account. Experimental results show that this method improves retrieval performance.

Our second approach transforms the query reduction problem into a learning-to-rank problem: all subsets of the original query (sub-queries) are ranked based on a set of features, and the top-ranked sub-query is selected. In this method, LDA is used to assign latent topic categories to the text corpus. Parameters are estimated with Gibbs sampling, a Markov chain Monte Carlo (MCMC) method. We also evaluate the approach on a collection of long queries. A careful analysis of the results shows that, unlike traditional techniques, our technique improves retrieval performance on difficult long queries.
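The sub-query ranking idea in the second approach can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the feature set (a sum of per-term importance weights and the sub-query length), the linear scoring model, and the names `sub_queries`, `features`, `score`, and `best_sub_query` are all assumptions made for illustration. In the thesis the ranker is learned and the features include semantic (LDA-based) signals; here fixed, hand-picked weights stand in for a trained model.

```python
from itertools import combinations


def sub_queries(terms):
    """Yield every non-empty subset of the query terms, keeping word order."""
    for k in range(1, len(terms) + 1):
        yield from combinations(terms, k)


def features(sub_query, term_weights):
    """Hypothetical feature vector: [sum of term weights, number of terms]."""
    return [sum(term_weights.get(t, 0.0) for t in sub_query), len(sub_query)]


def score(sub_query, term_weights, model=(1.0, -0.1)):
    """Linear scoring function standing in for a trained learning-to-rank model."""
    return sum(w * f for w, f in zip(model, features(sub_query, term_weights)))


def best_sub_query(terms, term_weights):
    """Rank all sub-queries by score and return the top one."""
    return max(sub_queries(terms), key=lambda sq: score(sq, term_weights))


# Illustrative input: the weights are made up, not taken from the thesis.
query = ["what", "are", "the", "side", "effects", "of", "aspirin"]
weights = {"side": 0.8, "effects": 0.9, "aspirin": 1.0}

print(best_sub_query(query, weights))  # ('side', 'effects', 'aspirin')
```

Enumerating all subsets grows exponentially with query length (2^n - 1 sub-queries for n terms), so a sketch like this is only practical for short queries or when candidates are pruned first.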
Keywords/Search Tags: Verbose Query, Query Expansion, Dependent Relationship, Learning to Rank