Efficient query processing for data integration

Posted on:2003-04-22

Degree:Ph.D

Type:Thesis

University:University of Washington

Candidate:Ives, Zachary George

GTID:2468390011980605

Subject:Computer Science

Abstract/Summary:

A major problem today is that important data is scattered throughout dozens of separately evolved data sources, in a form that makes the “big picture” difficult to obtain. Data integration presents a unified virtual view of all data within a domain, allowing the user to pose queries across the complete integrated schema.

This dissertation addresses the performance needs of real-world business and scientific applications. Standard database techniques for answering queries are inappropriate for data integration, where data sources are autonomous, they generally lack mechanisms for sharing statistical information about their content, and the environment is shared with other users and subject to unpredictable change. My thesis proposes the use of pipelined and adaptive techniques for processing data integration queries, and I present a unified architecture for adaptive query processing, including novel algorithms and an experimental evaluation. An operator called x-scan extracts the relevant content from an XML source as it streams across the network, which enables more work to be done in parallel. Next, the query is answered using algorithms (such as an extended version of the pipelined hash join) whose work is adaptively scheduled, varying to accommodate the relative data arrival rates of the sources. Finally, the system can adapt the ordering of the various operations (the query plan), either at points where the data is being saved to disk or in mid-execution, using a novel technique called convergent query processing. I show that these techniques provide significant benefits in processing data integration queries.
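For readers unfamiliar with the pipelined hash join mentioned in the abstract, the following is a minimal sketch of the textbook (symmetric) form of that operator, not the extended version developed in the dissertation. It assumes both inputs arrive as a single interleaved stream of tagged rows; the function name `pipelined_hash_join`, the `"L"`/`"R"` tags, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, Tuple

Row = Dict[str, Any]


def pipelined_hash_join(
    arrivals: Iterable[Tuple[str, Row]], key: str
) -> Iterator[Tuple[Row, Row]]:
    """Symmetric hash join over an interleaved stream of (side, row) pairs.

    `side` is "L" or "R" (hypothetical tags). Each arriving row is inserted
    into its own side's hash table and immediately probed against the other
    side's table, so results are emitted as soon as both matching rows have
    arrived, regardless of which source is faster.
    """
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    for side, row in arrivals:
        other = "R" if side == "L" else "L"
        k = row[key]
        # Build step: remember this row for later arrivals on the other side.
        tables[side][k].append(row)
        # Probe step: join with everything already seen on the other side.
        for match in tables[other].get(k, []):
            # Always yield (left_row, right_row) in a fixed order.
            yield (row, match) if side == "L" else (match, row)


if __name__ == "__main__":
    # Illustrative arrival order: the two sources deliver rows at uneven rates.
    stream = [
        ("L", {"id": 1, "name": "a"}),
        ("R", {"id": 1, "city": "x"}),
        ("R", {"id": 2, "city": "y"}),
        ("L", {"id": 2, "name": "b"}),
    ]
    for left, right in pipelined_hash_join(stream, "id"):
        print(left, right)
```

Because neither input has to be read to completion before output begins, an operator of this shape tolerates uneven arrival rates, which is the property the abstract relies on. The sketch keeps both hash tables in memory and does not attempt the adaptive scheduling or overflow handling that the dissertation describes.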

Keywords/Search Tags:

Data, Query processing, Queries

Related items

1	Visual Construction Of Scientific Data Queries And Query Processing Optimization Techniques
2	Efficient structural query processing in XML databases
3	Fast Computation on Processing Data Warehousing Queries on GPU Devices
4	Analytical query processing in data intensive applications
5	Query processing and optimization for structural selection queries over XML data
6	Data analysis and query processing in wireless sensor networks
7	Online query processing in Geographic Information Systems
8	Efficient query processing for data integration
9	Research On Machine Learning Enhanced Query Techniques
10	Adaptive Processing Of Ad-hoc Queries On Data Streams