Caching And Query Pruning Strategy For Efficient Search Engine Query Execution

Posted on: 2007-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: C Xie
Full Text: PDF
GTID: 2178360182993757
Subject: Computer application technology
Abstract/Summary:
The rapid growth in the number of web pages on the Internet requires search engines to answer thousands of queries per second. How to speed up a search engine without degrading query quality has become a hot topic in information retrieval research. During query processing, most of the time is spent scanning the huge inverted index, so optimizing inverted index access is the key to improving search engine performance.

Many approaches have been proposed recently, including query pruning and caching. Query pruning avoids reading an entire posting list, and buffering reduces I/O cost. Most evaluation functions in existing techniques consider only the query terms or an analysis of the page, but overlook an important factor: the query itself.

This thesis proposes a large buffer, residing both in memory and on disk, that stores the intersections of the posting lists of high-frequency term pairs. It also proposes a query pruning approach that accelerates the execution of multi-term queries by using these intersections, which are grouped by a query-based evaluation function. To integrate the two methods, the thesis proposes a new index architecture that uses the resources of the index system efficiently and greatly improves the effect of query pruning.

Experiments show that the proposed approach substantially improves the performance of the index subsystem while largely preserving the top-k results.
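
The idea of caching intersections of posting lists for frequent term pairs can be illustrated with a minimal in-memory sketch. Everything in it is an assumption made for illustration: the function names (`intersect`, `build_pair_cache`, `evaluate`), the dictionary-based index, and the use of plain pair frequency in a query log as a stand-in for the thesis's query-based evaluation function, which the abstract does not specify. The sketch also keeps the cache in memory only and omits the disk tier and the pruning step.

```python
from collections import Counter
from itertools import combinations


def intersect(a, b):
    """Merge-intersect two sorted lists of document IDs."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def build_pair_cache(index, query_log, capacity):
    """Precompute intersections for the most frequent term pairs.

    Assumption: pair frequency in the log decides what is cached.
    """
    pair_counts = Counter()
    for query in query_log:
        terms = sorted({t for t in query if t in index})
        pair_counts.update(combinations(terms, 2))
    return {
        pair: intersect(index[pair[0]], index[pair[1]])
        for pair, _ in pair_counts.most_common(capacity)
    }


def evaluate(index, cache, query):
    """Answer a conjunctive query, starting from a cached pair if one exists."""
    terms = sorted(set(query))
    if not terms or any(t not in index for t in terms):
        return []

    # Choose the cached pair with the shortest intersection, if any.
    best = None
    for pair in combinations(terms, 2):
        if pair in cache and (best is None or len(cache[pair]) < len(cache[best])):
            best = pair

    if best is not None:
        result = cache[best]
        remaining = [t for t in terms if t not in best]
    else:
        # No cached pair: fall back to the shortest posting list.
        by_length = sorted(terms, key=lambda t: len(index[t]))
        result = index[by_length[0]]
        remaining = by_length[1:]

    # Intersect with the remaining terms, shortest posting list first.
    for t in sorted(remaining, key=lambda t: len(index[t])):
        result = intersect(result, index[t])
    return result


# Toy data, invented for this example.
index = {
    "search": [1, 2, 3, 5, 8],
    "engine": [2, 3, 5, 9],
    "cache": [3, 5, 7],
}
query_log = [["search", "engine"], ["search", "engine", "cache"], ["engine", "search"]]
cache = build_pair_cache(index, query_log, capacity=1)
print(evaluate(index, cache, ["search", "engine", "cache"]))
```

With a cached pair available, a three-term query needs one further intersection instead of two, and the starting list is already shorter than either original posting list.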
Keywords/Search Tags:search engine, evaluation function, query pruning, PageRank, inverted index, buffer management, phrase querying