Information Organization And Retrieval For Web Entity And Relation

Posted on: 2012-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Gu
Full Text: PDF
GTID: 2218330362453606
Subject: Computer Science and Technology
Abstract/Summary:
Traditional ranking algorithms focus on estimating the importance, authority, and relevance of web pages or documents, and therefore belong to the document-level ranking model. However, a user who needs to find a specific web entity or object has to browse several web pages from the search engine results and filter out the redundant information. To better satisfy users' information needs, object-level ranking was introduced, which analyzes the entity-relation graph by computing web popularity. However, this kind of approach works only in specific knowledge domains and cannot find an unknown object by popularity computation alone.

In this paper, we propose a novel search system, ClueSearch, which lets users satisfy their information need from a single input clue. ClueSearch (1) organizes data from Wikipedia and DBpedia to construct the entity-relation graph and uses search engine log data to extend the graph; (2) indexes the data efficiently to speed up searching over this graph; (3) provides three novel ranking algorithms, namely BOLR, HistoRank, and PathRank, which work together with traditional relevance ranking to help users find their desired entity; and (4) employs natural language processing approaches to decompose the clue. The system integrates all of these components to let users search for objects and their corresponding relations on the entity-relation graph.

The user study shows that ClueSearch provides a novel user experience. Furthermore, experimental results on open-source data show that ClueSearch both overcomes the limitations of traditional object-level ranking algorithms and outperforms the baseline approaches.
Keywords/Search Tags: clue search, object-level ranking, entity-relation graph, data indexing
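
The abstract describes building an entity-relation graph, indexing it, and returning ranked entities for a single input clue. A minimal sketch of that general idea follows. Everything in it is an assumption made for illustration: the triples are toy data, the names `build_graph` and `clue_search` are hypothetical, and ranking by node degree is only a simple stand-in for popularity. It is not BOLR, HistoRank, or PathRank, whose definitions are not given in the abstract.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, relation, object). The thesis draws such
# data from Wikipedia and DBpedia; these rows are illustrative only.
TRIPLES = [
    ("Alan Turing", "born_in", "London"),
    ("Alan Turing", "field", "Computer Science"),
    ("Ada Lovelace", "born_in", "London"),
    ("Ada Lovelace", "field", "Mathematics"),
    ("Charles Babbage", "born_in", "London"),
    ("Charles Babbage", "collaborator", "Ada Lovelace"),
]


def build_graph(triples):
    """Map each entity to its (relation, neighbor) edges, in both directions."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((rel, subj))
    return graph


def clue_search(graph, clue):
    """Return (entity, relation, degree) for each neighbor of the clue entity.

    Results are ordered by degree, highest first, with ties broken by name.
    Degree is an assumed stand-in for popularity, not the thesis's ranking.
    """
    results = [(nbr, rel, len(graph[nbr])) for rel, nbr in graph.get(clue, [])]
    return sorted(results, key=lambda r: (-r[2], r[0]))


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for entity, relation, degree in clue_search(graph, "London"):
        print(entity, relation, degree)
```

Storing each edge in both directions lets a clue match either end of a relation, so the clue "London" reaches the people linked to it even though it appears only as an object in the toy triples.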