Modeling and design of entity identity information in entity resolution systems

Posted on: 2013-12-07
Degree: Ph.D
Type: Dissertation
University: University of Arkansas at Little Rock
Candidate: Zhou, Yinle
Full Text: PDF
GTID: 1458390008473918
Subject: Engineering
Abstract/Summary:
This dissertation describes and defines a new area of research called entity identity information management (EIIM) and shows that it is an extension and further elaboration of the Stanford Entity Resolution Framework (SERF). EIIM is defined as the collection and management of identity information with the goal of sustaining entity identity integrity, a fundamental data quality requirement for master data. Following the design science research methodology, the dissertation includes a formal mathematical model for describing EIIM. It also discusses how EIIM design choices affect the implementation of EIIM systems in terms of system efficiency and effectiveness. In addition, it describes the validation of the research through demonstration, experimentation, mathematical proofs, and the use of metrics. The dissertation also describes how EIIM design elements have been successfully implemented in the OYSTER open-source entity resolution system. It shows how Version 3.2 of OYSTER supports eight EIIM configurations, including four types of asserted resolution configurations. Asserted resolution uses external knowledge to override and correct errors introduced by inferred resolution (automated rule-based decisions). Working in concert with inferred resolution, asserted resolution increases the capability of EIIM to support all aspects of the identity information life cycle and provides a valuable tool for creating and maintaining persistent entity identifiers. In addition to describing the EIIM configurations implemented in OYSTER, the dissertation provides practical guidance about when and how to use them to manage identity information successfully. Several aspects of this research have already been published, including one book chapter, three journal articles, and papers in six conference proceedings.
Keywords/Search Tags: Identity information, EIIM, Resolution