System support for conflict resolution in integrating metadata

Posted on: 2010-10-09
Degree: Ph.D.
Type: Dissertation
University: Arizona State University
Candidate: Qi, Yan
Full Text: PDF
GTID: 1442390002477890
Subject: Computer Science
Abstract/Summary:
A crucial problem in data integration is that knowledge obtained from different sources often conflicts. Conflict resolution, whether performed during the design phase or at run time, can be costly and, if done without a proper understanding of the usage context, ineffective. Metadata integration, a special and important case of data integration, plays a critical role in many applications, such as data warehousing and Peer-to-Peer (P2P) systems. This dissertation addresses the problem of conflict resolution in metadata integration. Two strategies are commonly used to deal with conflicts. The first is the pre-cleaning strategy, which removes conflicts while integrating metadata from different data sources and constructs consistent integrated metadata, though most likely with some information loss. The second is the pay-as-you-go strategy, which delays conflict resolution until after query processing and may provide context within which conflicts can be eliminated in an informed manner. This dissertation explores both strategies. With the pre-cleaning strategy, I first propose an approach to eliminating uncertain information from imperfectly integrated metadata to support Online Analytical Processing (OLAP) operations. Then, I introduce the Feedback-based Inconsistency Resolution (FICSR) system, which takes a novel exploration- and feedback-based approach to conflict resolution when integrating metadata from different sources. Rather than relying on purely automated conflict resolution mechanisms, FICSR brings the domain expert into the conflict resolution process and informs the integration based on the expert's feedback. Furthermore, to improve query processing in FICSR, which helps the user identify the information of interest and highlight related conflicts, I propose a horizon-based ranked join algorithm for evaluating top-k twig queries on the integrated concept graph with conflicts. Moreover, I present a table summarization technique that enables effective navigation over multi-dimensional data with the help of the uncertain integrated metadata. Experiments show that these techniques significantly benefit conflict resolution in metadata integration.
Keywords/Search Tags:Conflict resolution, Data, Integration, System