Source discovery and schema mapping for data integration

Posted on: 2004-05-17
Degree: Ph.D.
Type: Dissertation
University: Brigham Young University
Candidate: Xu, Li
Full Text: PDF
GTID: 1468390011473969
Subject: Computer Science
Abstract/Summary:
As data on the Web explodes, there is a growing need to integrate data from a large number of heterogeneous information sources. Currently, there are two basic approaches to data integration: Global-as-View (GAV) and Local-as-View (LAV). However, both approaches have limitations for large-scale applications. To resolve these problems, we offer a Target-based Integration Query System (TIQS) as an alternative point of view that is neither GAV nor LAV. The approach uses a predefined conceptual target schema, which is specified ontologically and independently of any of the sources, as a central, organizing concept. In this dissertation, we focus on resolving three problems in TIQS: (1) automatically recognizing information sources applicable to the target, (2) automating the generation of source-to-target mappings, and (3) reformulating queries based on source-to-target mappings. Experiments we have conducted show that we achieve good performance both in recognizing applicable documents and in generating source-to-target mappings. Moreover, we have proven that query reformulation in TIQS reduces to rule unfolding and that the reformulated user queries extract all the query answers available from the sources with respect to the definition of TIQS for the proposed queries.
Keywords/Search Tags:Data, TIQS, Sources, Query