A Mediator-based Data Integration System for Query Answering using an Optimized Extended Inverse Rules Algorithm

Posted on: 2011-08-08
Degree: M.A.Sc
Type: Thesis
University: Carleton University (Canada)
Candidate: Jayaraman, Gayathri
GTID: 2448390002958839
Subject: Computer Science
Abstract/Summary:
A mediator system allows users to pose queries against a global schema and returns answers from multiple data sources. The rewriting of the user query in terms of the local sources uses mappings, which, in the Local-As-View (LAV) approach, describe the source relations as views over the global schema. Among the existing algorithms that perform query rewriting in LAV, the Extended Inverse Rules Algorithm (EIRA) provides the most general approach. Given a set of mappings and database facts, EIRA provides a logic program, which specifies a class of legal instances of the global system. The specification of the legal instances can be used to compute certain answers for user queries that are monotone.

In this thesis, we describe the design, representation and implementation of a mediator system, called Virtual Integration Support System (VISS), that uses an optimized EIRA for query answering. We describe a general framework for metadata representation in a virtual and relational data integration system under the LAV approach. Specifically, we use XML and RuleML for representing metadata, viz. the global and local schemas, the mappings between the former and the latter, and global integrity constraints.

We also show how to obtain a reduced set of mappings and a subset of available sources for a user query. Using this, we optimize the logic program by generating only the required parts (i.e., those that can be used for answering the query) of the specification program in EIRA. We also import only the relevant facts using the reduced list of sources for computing the answers.

We describe how XQuery can be used to retrieve the relevant information for EIRA based on our optimized approach. The information is then used to build the logic program specification for computing certain answers. The implementation of VISS uses open-source tools and is used to compute certain answers to Datalog queries, which are monotone.

However, the output of EIRA is only a program specification. Therefore, applying it in a data integration system for query answering requires the design of a system that can store, specify and query the metadata representation. Moreover, it is inefficient to consider all the available mappings and use the facts from all the sources for computing answers to the user query.
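
The idea of reducing the mappings and sources for a given query can be illustrated with a minimal Python sketch. This is not the VISS implementation, and the abstract does not state the thesis's actual relevance criterion. The sketch assumes a purely syntactic test: a LAV mapping is relevant if the view body mentions at least one global predicate used by the query. The names Mapping, relevant_mappings and relevant_sources, the predicates Author, Paper and Venue, and the sources S1 to S3 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """A LAV mapping: one source relation described as a view over global predicates."""
    source: str              # name of the local source relation (hypothetical)
    global_preds: frozenset  # global predicates appearing in the view body


def relevant_mappings(query_preds, mappings):
    """Keep only mappings whose view body shares a global predicate with the query.

    Assumption: relevance is a simple syntactic overlap test. A real system
    may also need to follow integrity constraints or intermediate predicates.
    """
    wanted = set(query_preds)
    return [m for m in mappings if m.global_preds & wanted]


def relevant_sources(selected):
    """Return the sorted names of the sources behind the selected mappings."""
    return sorted({m.source for m in selected})


# Hypothetical metadata: three sources defined over a small global schema.
mappings = [
    Mapping("S1", frozenset({"Author", "Paper"})),
    Mapping("S2", frozenset({"Paper", "Venue"})),
    Mapping("S3", frozenset({"Venue"})),
]

# A query that mentions only the global predicate Paper.
selected = relevant_mappings({"Paper"}, mappings)
sources_to_import = relevant_sources(selected)
```

Under these assumptions, only the facts of the sources in sources_to_import would be imported, and only the rules derived from the selected mappings would be generated for the logic program.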
Keywords/Search Tags:Query, System, Answers, Sources, User, EIRA, Mappings, Global