
Named Entity Mining Based On Click-Through Data And Search-Result Snippet

Posted on: 2012-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: J W Du
Full Text: PDF
GTID: 2218330362453602
Subject: Computer Science and Technology
Abstract/Summary:
This paper addresses the problem of Named Entity Mining (NEM) from click-through data collected by search engines. Named Entity Mining is an important web search technology used in many applications, such as online advertising, user behavior analysis, and recommender systems. Rare entities, taken together, account for a considerable fraction of the search engine queries that contain entities, yet query logs and click-through data provide little context for them, which causes some classical NEM algorithms to fail. To address this limitation, I propose issuing each entity as a query and using the retrieved search-result snippets as its context. Given predefined classes and seed named entities, I use a template matching method, supported by search-result snippets, to mine named entities from click-through data and assign them to classes. During mining, I use WS-LDA (Weakly Supervised Latent Dirichlet Allocation) to learn topic models that resolve ambiguity among named entity classes. I also train a query classifier on the search-result snippets of the seed named entities and use it to refine the mining results. Experiments on a large-scale, real-world click-through dataset show that the proposed approach performs significantly better than the baseline.
Keywords/Search Tags: Named Entity Mining, Search-Result Snippet, Click-Through Data, Topic Model, Named Entity Recognition
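
The seed-and-template step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it works on plain query strings only and leaves out the clicked URLs, the search-result snippets, WS-LDA, and the query classifier. The function names (`make_template`, `learn_templates`, `match_template`, `mine_entities`), the `#` placeholder, and the count-based scoring are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def make_template(query, entity):
    """Replace the entity inside the query with '#'; None if absent or no context."""
    q_tokens = query.lower().split()
    e_tokens = entity.lower().split()
    n = len(e_tokens)
    for i in range(len(q_tokens) - n + 1):
        if q_tokens[i:i + n] == e_tokens:
            rest = q_tokens[:i] + ["#"] + q_tokens[i + n:]
            if len(rest) > 1:  # require at least one context token
                return " ".join(rest)
    return None


def learn_templates(queries, seeds):
    """Count, per class, the context templates in which seed entities occur."""
    templates = defaultdict(Counter)
    for query in queries:
        for entity, cls in seeds.items():
            template = make_template(query, entity)
            if template is not None:
                templates[cls][template] += 1
    return templates


def match_template(query, template):
    """Return the tokens that fill '#' if the query fits the template, else None."""
    q = query.lower().split()
    left, _, right = template.partition("#")
    left, right = left.split(), right.split()
    if len(q) <= len(left) + len(right):
        return None  # nothing would be left to fill the placeholder
    if q[:len(left)] != left:
        return None
    if right and q[len(q) - len(right):] != right:
        return None
    return " ".join(q[len(left):len(q) - len(right)])


def mine_entities(queries, templates, seeds):
    """Assign each new candidate the class whose templates matched it most."""
    known = {entity.lower() for entity in seeds}
    scores = defaultdict(Counter)
    for query in queries:
        for cls, counter in templates.items():
            for template, weight in counter.items():
                candidate = match_template(query, template)
                if candidate and candidate not in known:
                    scores[candidate][cls] += weight
    return {cand: s.most_common(1)[0][0] for cand, s in scores.items()}
```

For example, with the seeds `{"harry potter": "movie", "halo 3": "game"}` and the queries `"harry potter trailer"`, `"inception trailer"`, `"halo 3 walkthrough"`, and `"portal 2 walkthrough"`, the sketch learns the templates `# trailer` and `# walkthrough` and then labels `inception` as a movie and `portal 2` as a game. A candidate that matches templates from several classes is exactly the ambiguous case the abstract hands to the topic model; the sketch simply picks the class with the highest count.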