Automatic Entity Resolution For The Data Web

Posted on: 2011-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Fu
Full Text: PDF
GTID: 2178360308452444
Subject: Computer application technology
Abstract/Summary:
As the data Web gradually takes shape, large-scale automatic entity resolution becomes a critical task. The task aims to match entities from different data sources that have the same meaning, so that programs built on the data can achieve more complete and accurate results when performing operations such as discovery, search, filtering and summarization. This improves the performance and user experience of a wide range of Web applications, such as search, browsing and various mashups.

However, performing such data integration raises several challenges. First, it requires a scalable solution that can handle the Web of data while still preserving sufficient integration quality in terms of precision and recall. Second, such approaches are hard to evaluate, since there are currently no benchmarks that reflect the variety of real data on the Web.

In this thesis, I tackle both challenges by proposing a solution to the problem of entity resolution for the data Web. The solution follows a two-stage process: a blocking stage first groups entities that are likely to be equivalent, and then, within each block, a clustering procedure is performed based on the local structural properties of the entities. Experimental results show that the solution scales well while achieving effective results in terms of precision and recall.
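
The two-stage process described in the abstract can be illustrated with a minimal sketch. The abstract does not state which blocking key, similarity measure or clustering algorithm the thesis uses, so everything concrete here is an assumption made for illustration: the first-token blocking key, the Jaccard similarity over each entity's neighbouring values (standing in for "local structural properties"), the 0.5 threshold, the union-find clustering, and all function names and sample data.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(label):
    """Hypothetical blocking key: first token of the lowercased label."""
    tokens = label.lower().split()
    return tokens[0] if tokens else ""


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resolve(entities, threshold=0.5):
    """Cluster entities given as {id: (label, set_of_neighbour_values)}.

    Returns a list of clusters, each a set of entity ids.
    The threshold is an illustrative assumption, not a value from the thesis.
    """
    # Stage 1 (blocking): group entities that share a cheap key, so that
    # expensive pairwise comparisons only happen inside a block.
    blocks = defaultdict(list)
    for eid, (label, _) in entities.items():
        blocks[blocking_key(label)].append(eid)

    # Stage 2 (clustering): within each block, merge pairs whose local
    # neighbourhoods are similar enough, using union-find to build clusters.
    parent = {eid: eid for eid in entities}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for members in blocks.values():
        for a, b in combinations(members, 2):
            if jaccard(entities[a][1], entities[b][1]) >= threshold:
                parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for eid in entities:
        clusters[find(eid)].add(eid)
    return list(clusters.values())


# Made-up sample data: ids, labels and neighbour values are hypothetical.
example = {
    "a:1": ("Berlin", {"Germany", "capital"}),
    "b:7": ("berlin city", {"Germany", "capital", "Spree"}),
    "a:2": ("Berlin Township", {"USA", "Ohio"}),
    "b:9": ("Paris", {"France", "capital"}),
}
clusters = resolve(example)
```

In this sketch, blocking keeps the number of comparisons low because "Paris" is never compared with the "Berlin" entities, and the clustering step separates the two unrelated "Berlin" entities that blocking alone would have grouped together. A real data Web setting would need a more robust key and similarity measure than the ones assumed here.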
Keywords/Search Tags: Data Web, Entity Resolution