
Efficient Index Structure For Advertisement Search

Posted on: 2011-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liang
Full Text: PDF
GTID: 2178360308452445
Subject: Computer application technology
Abstract/Summary:
Web advertising plays an important role among advertising channels and accounts for a growing share of the revenue of major Internet service providers such as Google and Yahoo. A commonly used strategy, called Sponsored Search, places advertisements (ads) on search result pages according to the queries users submit. Relevant ads are more likely to be clicked, which increases revenue for both advertisers and publishers.

However, Web advertising data differs from other textual data. Each bid phrase defined by an ad owner is usually contained in only a limited number of ads, so directly matching user queries against bid phrases often returns few appropriate ads. Methods from traditional Information Retrieval for textual data therefore cannot be applied directly. To address this shortcoming, query expansion is often used to increase the chances of matching ads. Nevertheless, query expansion on top of a traditional inverted index faces efficiency issues such as high time complexity and heavy I/O costs. Moreover, precision is not always improved and is sometimes even hurt by the additional noise that expansion introduces.

In this paper, an efficient ad search solution relying on a block-based index is proposed to tackle the issues associated with query expansion. The block-based index structure places clusters of similar bid phrases, together with their associated ads, in corresponding blocks. This significantly reduces the number of merge operations during query expansion and allows sequential scans rather than random accesses, saving I/O costs. Block sizes are flexible and follow the clustering results of the bid phrases, which further optimizes the index structure for efficient ad search. The clusters are pre-computed by an agglomerative iterative clustering algorithm. Finally, a spreading activation ranking mechanism is proposed to return the top-k relevant ads, improving search precision.

Compared with other Sponsored Search solutions, the block-based index combines clustering of bid phrases with the storage of ad data. The experimental results show that it can indeed return a larger number of relevant ads without sacrificing execution speed.
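
The block-based layout described in the abstract can be illustrated with a minimal in-memory sketch. It is not the thesis implementation: the class and function names (`Block`, `BlockIndex`, `search`) are hypothetical, the clusters are assumed to be pre-computed, Python lists stand in for on-disk blocks, and the spreading activation ranking step is omitted. It only shows why a query that hits one bid phrase can retrieve the ads of the whole cluster with a single sequential scan instead of merging several postings lists.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One cluster of similar bid phrases stored together with their ads."""
    # Contiguous (bid_phrase, ad_id) pairs; block size varies with cluster size.
    entries: list = field(default_factory=list)


class BlockIndex:
    """Hypothetical block-based index: one block per bid-phrase cluster."""

    def __init__(self, clusters, ads_by_phrase):
        self.blocks = []      # list of Block, one per cluster
        self.block_of = {}    # bid phrase -> position of its block
        for cluster in clusters:
            block = Block()
            for phrase in cluster:
                for ad_id in ads_by_phrase.get(phrase, []):
                    block.entries.append((phrase, ad_id))
                self.block_of[phrase] = len(self.blocks)
            self.blocks.append(block)

    def search(self, bid_phrase):
        """Return ads of every phrase in the matching cluster (implicit expansion)."""
        block_id = self.block_of.get(bid_phrase)
        if block_id is None:
            return []
        seen, result = set(), []
        # One sequential pass over the block; no per-phrase list merging.
        for _, ad_id in self.blocks[block_id].entries:
            if ad_id not in seen:
                seen.add(ad_id)
                result.append(ad_id)
        return result


# Illustrative data only; phrases and ad ids are made up.
clusters = [["cheap flights", "discount airfare"], ["running shoes"]]
ads_by_phrase = {
    "cheap flights": [1, 2],
    "discount airfare": [2, 3],
    "running shoes": [4],
}
index = BlockIndex(clusters, ads_by_phrase)
expanded_ads = index.search("cheap flights")
```

In this sketch, looking up "cheap flights" also returns the ads bid on "discount airfare" because both phrases share a block, which is the effect the abstract attributes to combining clustering with ad storage.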
Keywords/Search Tags: Advertisement Search, Block-based Index, Clustering