Interactive query processing

Posted on: 2002-10-09

Degree: Ph.D.

Type: Thesis

University: University of California, Berkeley

Candidate: Raman, Vijayshankar

GTID: 2468390011497032

Subject: Computer Science

Abstract/Summary:

Information extraction is increasingly a long and frustrating iterative process, because of large data sizes, increasing data distribution, and the hard-to-automate nature of many processing tasks. This thesis investigates how interactive query processing can alleviate this problem. Interactivity involves giving users continual feedback during query execution in the form of partial query results, and allowing users to dynamically control query execution according to their interest in these partial results.

In this thesis, we develop modifications to the standard query processor architecture that enable such user-system interaction. We start by developing a pipelining reorder operator that can be inserted into a standard query plan to make it dynamically tunable during query execution. This operator uses the throughput differences between adjacent query operators to reorder tuples within the query dataflow and prioritize the processing of tuples of interest to the user. We then study the application issues involved in interactive processing by developing an interactive data cleaning and transformation tool. This tool allows users to explore large datasets through a spreadsheet-like interface and graphically specify transforms to clean errors in the data format. All operations are performed with instantaneous response times by focusing work on the data that is visible to the user.

We then investigate the generation of partial result records, which may not contain all output columns, as a way to improve system interactivity during query execution. Our focus is on generating these partial results in a fashion that is responsive to both the user's interests in the results and the properties of the data sources involved in the query. A significant hurdle to such partial result generation is the traditional query execution dataflow of optimizer-selected query plans. We develop a more dynamic dataflow scheme that continually adapts two orderings within the query dataflow: the order in which intermediate tuples are routed, and the order in which these tuples flow through query operators. We then refine the granularity of query operators in this architecture, routing tuples not through logical query operators such as joins, but instead through physical operators such as query data structures. This scheme allows the query processor to adapt query execution at a fine granularity and to respond more effectively to changing user interests and data source properties.
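
The reorder idea described in the abstract can be illustrated with a toy sketch. This is not the thesis's operator or algorithm; it only assumes a buffer sitting between a fast producer and a slower consumer, where buffered tuples are handed out in order of the user's current interest. The class name `ReorderBuffer`, the `group_of` function, and the `region` field are all hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque


class ReorderBuffer:
    """Toy buffer between a fast producer and a slower consumer (illustrative only)."""

    def __init__(self, group_of):
        self.group_of = group_of               # hypothetical: maps a tuple to its group key
        self.queues = defaultdict(deque)       # buffered tuples per group, in arrival order
        self.preference = defaultdict(float)   # user interest per group, 0.0 by default

    def put(self, tup):
        """Called by the producer: buffer the tuple under its group."""
        self.queues[self.group_of(tup)].append(tup)

    def set_preference(self, group, weight):
        """Called when the user changes interest; affects all later get() calls."""
        self.preference[group] = weight

    def get(self):
        """Called by the consumer: return the oldest tuple of the most preferred
        non-empty group, or None if nothing is buffered."""
        candidates = [g for g, q in self.queues.items() if q]
        if not candidates:
            return None
        best = max(candidates, key=lambda g: self.preference[g])
        return self.queues[best].popleft()


# Usage: the producer has run ahead of the consumer, so four tuples are buffered.
buf = ReorderBuffer(group_of=lambda t: t["region"])
for i, region in enumerate(["east", "west", "east", "west"]):
    buf.put({"id": i, "region": region})

buf.set_preference("west", 1.0)   # the user is currently interested in "west"
first = buf.get()
buf.set_preference("east", 2.0)   # interest shifts while the query is running
second = buf.get()
```

The sketch only reorders what is already buffered, which is why the throughput gap matters: the further the producer runs ahead of the consumer, the more tuples there are to choose from when the user's interest changes.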

Keywords/Search Tags:

Query, Data, Interactive, Processing

Related items

1	Research On Key Technologies Of Distributed Rank-aware Query Processing
2	Research On Interactive Multi-users Skyline Query Processing
3	Research On Key Techniques Of Query Processing Over Wireless Sensor Networks
4	Efficient structural query processing in XML databases
5	Research On Key Techniques Of Query Processing Over Large-scale Graph Data
6	Query Processing And Optimization Over Various Types Of Streaming Data
7	Research On Interactive Aggregate Query Method Of Multidimension Time Series Data
8	Semantic Query Processing Over Linked Data Knowledge Bases
9	Research On Distributed Query Processing And Optimization Of RDF Data
10	Interactive Data Exploration using Gesture