Adaptive query processing in data stream management systems

Posted on:2006-01-18

Degree:Ph.D

Type:Thesis

University:Stanford University

Candidate:Babu, Shivnath

Full Text:PDF

GTID:2458390008970305

Subject:Computer Science

Abstract/Summary:

Many modern applications need to process data streams that consist of data elements generated in a continuous unbounded fashion. Examples include network monitoring, financial monitoring over stock tickers, sensor processing for environmental monitoring or inventory tracking, telecommunications fraud detection, and others. These applications have spurred interest in a new class of systems, called Data Stream Management Systems (DSMSs), that enable applications to pose long-running continuous queries over data streams.; A fundamental challenge faced by DSMSs is that stream conditions (e.g., data distribution, arrival rate) and system conditions (e.g., query load, memory availability) may vary significantly over the lifetime of a continuous query. When stream or system conditions change, a query execution strategy that was efficient before the change may become very inefficient. Consequently, it is important for a DSMS to support adaptive query processing: The DSMS must be prepared to change the execution plan for a continuous query while the query is running, based on how stream and system conditions change. Without adaptivity, plan performance may drop drastically over time.; This thesis presents a generic framework, called StreaMon, for adaptive query processing in a DSMS. StreaMon has three core components: (i) An Executor, which runs the current plan for each query, (ii) a Profiler, which collects and maintains statistics about current stream and system conditions, and (iii) a Re-optimizer, which ensures that the current plans are the most efficient for current conditions. We instantiate the generic StreaMon framework for three distinct combinations of continuous query type and adaptivity need: (1) Adaptive processing of commutative filters over a stream to maximize throughput at all points in time. (2) Adaptive placement of subresult caches in pipelined plans for windowed stream joins to maximize throughput at all points in time. (3) Detecting relaxed constraints automatically in input streams and exploiting these constraints to reduce memory requirements in plans for windowed stream joins. For each problem, we provide the definition and motivating examples, develop and analyze adaptive algorithms, and present implementation techniques and experimental results from the STREAM general-purpose DSMS prototype developed at Stanford.

Keywords/Search Tags:

STREAM, Adaptive, Data, DSMS, System, Continuous

Related items

1	Research And Design On The Data Stream Management System
2	Research And Design On The Data Stream System
3	Adaptive Distributed Data Stream Management System
4	Research On An Application Of Data Stream Query And Data Stream Mining In Oil Field
5	The Data Stream Adaptive Query Processing Techniques
6	A Study Of Data Mining Based On Data Stream Management System
7	A framework for supporting quality of service requirements in a data stream management system
8	Design And Implementation Of Continuous Query Processing Prototype System Over Data Streams
9	Research On The Technology Of Continuous Query Processing Over Data Stream
10	Research And Implementation Of Mechanism Of Query Processing In Data Stream System