Font Size: a A A

Hardware Processor And Corresponding Algorithms For Continuous Queries

Posted on:2007-09-20Degree:DoctorType:Dissertation
Country:ChinaCandidate:J B QianFull Text:PDF
GTID:1118360212465595Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Recently there has witnessed a growing interest in continuous queries for scenarios in which data streams arrive at very high rates and a DSMS (Data Streams Management System) is registered with many simultaneous queries. Because data streams are rapid, unbounded and have to be processed on line, accelerating data processing is one of the key problems in a DSMS. Many DSMS researchers promote processing speed by query and schedule optimization. It potentially results in saturating the CPU. Loadshedding is a candidate choice when a DSMS is executing aggregate operations and it becomes overload. However, loadshedding can not applied on join operation for it will potentially lost many results. Not limiting in query or schedule optimization, we propose an entire novel framework to accelerate data processing. To our knowledge, there are no correlated literatures. The thesis involves many research areas, such as query processing algorithms, instruction-set design, hardware processor design and compilers. Our contributions are listed below.1. In order to process continuous queries, a novel window join approach named M3 Join (Join over Multiple streams, Multiple queries, and Multiple windows) and its implementation architecture Roujoin are proposed. Unlike many other window join algorithms, M3Join executes window joins in a router-like manner. Roujoin contains a join-routing-table and several join-areas, and is initialized or updated according to those simultaneous queries. Each tuple in the data streams is extended with a route tag. When an original tuple arrives, it is inserted into the corresponding buffer in one of the join-areas. Then it searches the join-routing-table and switches into the right join-area to perform join operations or return to the end users. The generated join tuples whose route tags have been updated iterate the above search and join procedures until there is no join result produced. Other original tuples will be processed in the same way. The approach needs only one scan over the data streams since different join queries share the intermediate results. The experimental results indicate that the approach is feasible and efficient.2. We propose our framework for processing continuous queries. A traditional DBMS processes one-time queries, but a DSMS processes continuous queries. The traditional heuristic of pushing selection predicates below joins is often inappropriate for continuous queries because early selection destroys the ability to share subsequent join processing. Given the high cost of joins, it is possible more efficient that process a join once and then send those results to the selection, Group By and aggregate operators. We evaluate three alternative selection placement strategies, each of which can share join results, for optimizing a large number of continuous queries. In our studies, we find the PullUp strategy (selections are pull above joins) has poor efficiency. Filtered PullUp strategy (tuples are filtered with the union of the selection predicates, then executed PullUp strategy) and Shared PushDown strategy (selections are pushed below joins) have different advantages. Because Filtered PullUp stategy is simple and uses less memory than Shared PushDown, we regard the former strategy as a perfect one.3. Because continuous queries are added or deleted at any moment, the focus of query optimization for a DSMS is to find an algorithm that adapts to frequently add or delete queries. We propose a window joins optimization algorithm to meet the above challenge. For each newly query, we build all probing sequences of each data stream by Minimum Spanning Tree algorithm. If prefixes of the probing sequences are common, we unite the prefixes to reduce the duplicate calculation. The algorithm can find acceptable approximate optimization plans by a little cost. And it need not update...
Keywords/Search Tags:continuous query, window join, query processing, instruction set design, multiple-query optimization, FPGA
PDF Full Text Request
Related items