A Research And Application On Entity Resolution

Posted on: 2016-02-24 Degree: Master Type: Thesis
Country: China Candidate: C Zhu Full Text: PDF
GTID: 2308330476953341 Subject: Computer Science and Technology
Abstract/Summary:
Traditional Entity Resolution (ER) is the process of taking one or more collections of references that describe the real world and identifying all references that refer to the same entity. ER is a key step in data cleaning, data integration, data mining and data quality assurance. Research on ER has been under way for some time. With the rapid spread of the Internet and the explosive growth in data scale, it has become a key problem to acquire accurate information from huge amounts of data, eliminate the ambiguity among similar data and detect erroneous information. Many research results have already been applied in domains including insurance, banking and health care.

In this paper, we list and explain some classic algorithms in the development of Entity Resolution, including pair-wise Entity Resolution, collective Entity Resolution and Entity Resolution on big data, among others. We also describe the characteristics and limitations of these algorithms and present some state-of-the-art algorithms that have emerged from new application environments with different requirements.

With the rise of e-commerce, precise resolution of Web products urgently needs to be addressed. Web data is non-standard and unstructured, which is both a new challenge and a new opportunity. We focus on the resolution of Web products. Several resolution algorithms, including WHIRL and TMWM, are analyzed and compared, and additional information is taken into account to form the SSM algorithm, which yields more accurate results. Furthermore, the SSM algorithm is sped up through a global cache of string similarities, a library of knowledge constraints and a blocking strategy.
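
To make two of the speed-up ideas named above concrete (a global cache of string similarities and a blocking strategy), the following is a minimal illustrative sketch of pair-wise resolution over product titles. It is not the SSM algorithm, whose details the abstract does not give. The similarity measure (Python's `difflib.SequenceMatcher`), the blocking key (first token of the title), the threshold of 0.8 and all function names are assumptions chosen only for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from functools import lru_cache
from itertools import combinations


@lru_cache(maxsize=None)
def _cached_ratio(a: str, b: str) -> float:
    # Global cache: each distinct string pair is scored only once.
    return SequenceMatcher(None, a, b).ratio()


def similarity(a: str, b: str) -> float:
    # Order the arguments so (a, b) and (b, a) share one cache entry.
    if a > b:
        a, b = b, a
    return _cached_ratio(a, b)


def blocking_key(title: str) -> str:
    # Hypothetical blocking key: the first token of the lower-cased title.
    tokens = title.lower().split()
    return tokens[0] if tokens else ""


def resolve(titles, threshold=0.8):
    """Group indices of titles assumed to refer to the same product."""
    # Blocking: only titles that share a key are ever compared.
    blocks = defaultdict(list)
    for i, title in enumerate(titles):
        blocks[blocking_key(title)].append(i)

    # Union-find structure to merge matching pairs into clusters.
    parent = list(range(len(titles)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for members in blocks.values():
        for i, j in combinations(members, 2):
            if similarity(titles[i].lower(), titles[j].lower()) >= threshold:
                parent[find(j)] = find(i)

    clusters = defaultdict(list)
    for i in range(len(titles)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    print(resolve(["Canon EOS 70D", "canon eos 70d", "Nikon D7100"]))
```

Blocking reduces the number of compared pairs from all pairs to pairs within each block, at the cost of missing matches whose titles fall into different blocks. The cache pays off when the same strings recur across many records.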
Keywords/Search Tags: Entity Resolution, Record Linkage, Collective Data, Complex Data, Big Data, Web Product Resolution