
Research On Adaptive Query Processing Of Data Stream Based On Eddy

Posted on: 2013-03-24  Degree: Master  Type: Thesis
Country: China  Candidate: Y G Wang  Full Text: PDF
GTID: 2248330362470874  Subject: Computer software and theory
Abstract/Summary:
At present, in fields such as wireless sensor networks, network traffic monitoring, financial applications, and communications data management, the data being processed is no longer static data held in storage media but data streams that arrive continuously and in real time. Compared with traditional static data, data streams are unbounded, arrive continuously, and change dynamically, so traditional query processing techniques for static data are not adequate for querying data streams. Because queries over data streams are long-running and continuous, and the characteristics of the streams vary while a query runs, processing such queries adaptively is a major challenge for data stream management systems.

Currently, the query processing framework called Eddy is one of the most important research results in adaptive query processing of data streams. Eddy is a routing-based adaptive query processing technique whose core is the routing policy used to schedule routes adaptively. Eddy has two main shortcomings. First, current routing policies assume that the predicates are independent of each other; when the predicates are correlated, the route they compute may be a poor query plan, resulting in inefficient queries. Second, the adaptation granularity of the current routing policy with batching is a fixed value K; if K is chosen poorly, it introduces unnecessary overhead and reduces query efficiency. This thesis studies and addresses these shortcomings of Eddy.
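As a rough illustration of routing with batching at a fixed granularity K, the following Python sketch recomputes the route once every K tuples, ranking predicates by their observed pass rate. The function name `run_batched_eddy`, the dictionary-based tuples, and the ranking rule are assumptions made for this example, not the thesis's implementation; a real Eddy also routes tuples among join operators and weighs operator cost.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Predicate = Callable[[dict], bool]


def run_batched_eddy(
    stream: Iterable[dict],
    predicates: Dict[str, Predicate],
    k: int,
) -> Tuple[List[dict], int, List[str]]:
    """Filter a stream, recomputing the predicate route every k tuples.

    The route orders predicates by observed pass rate, lowest first.
    That ordering is only justified when predicates are independent
    and equally costly (a simplifying assumption of this sketch).
    """
    names = list(predicates)
    route = names[:]                   # initial route: declaration order
    seen = {n: 0 for n in names}       # tuples each predicate evaluated
    passed = {n: 0 for n in names}     # tuples each predicate let through
    output: List[dict] = []
    evaluations = 0

    for i, tup in enumerate(stream, start=1):
        alive = True
        for name in route:
            evaluations += 1
            seen[name] += 1
            if predicates[name](tup):
                passed[name] += 1
            else:
                alive = False
                break                  # tuple dropped, skip the rest
        if alive:
            output.append(tup)
        if i % k == 0:
            # Fixed granularity: re-plan after every k tuples,
            # whether or not the statistics have changed.
            route = sorted(
                names,
                key=lambda n: passed[n] / seen[n] if seen[n] else 1.0,
            )
    return output, evaluations, route
```

Because K is fixed, the route is re-sorted every K tuples even when the re-sort yields the same route, and each pass rate is used as if the predicates were independent of one another.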
The main work of the thesis is as follows:
(1) We survey adaptive query processing techniques for data streams and point out the shortcomings of Eddy.
(2) Current routing policies, which assume that predicates are independent of each other, do not handle queries over data streams well when the query predicates are correlated. We propose a correlation-aware routing policy called the adaptive greedy routing policy. This policy takes the correlations among predicates into consideration, so the route it computes is closer to the actual best route when the query predicates are correlated. Experiments show that the policy is effective in that situation.
(3) The adaptation granularity of the routing policy with batching is a fixed value K. If K is chosen poorly, Eddy may compute the same route repeatedly, which introduces unnecessary overhead and reduces query efficiency. We improve the routing policy with batching so that it detects changes in the characteristics of the data streams and adaptively adjusts the adaptation granularity, with the aim of providing better query efficiency however the stream characteristics change. Experiments show that the improved routing policy is effective.
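The abstract does not give the details of either improvement, so the following Python sketch only illustrates the two ideas under stated assumptions. `greedy_route` picks each next predicate by its pass count on the sample tuples that survived the predicates already chosen, which is one common way to account for correlation. `next_batch_size` doubles the granularity while the route stays the same and resets it when the route changes. Both function names, the sample-based estimation, and the doubling rule are hypothetical and are not taken from the thesis.

```python
from typing import Callable, Dict, List, Sequence

Predicate = Callable[[dict], bool]


def greedy_route(
    sample: Sequence[dict],
    predicates: Dict[str, Predicate],
) -> List[str]:
    """Order predicates greedily using conditional pass counts.

    Each step measures candidates only on tuples that passed the
    predicates already placed in the route, so correlations between
    predicates influence the order.
    """
    remaining = list(predicates)
    survivors = list(sample)
    route: List[str] = []
    while remaining:
        # Candidate that lets the fewest current survivors through.
        best = min(
            remaining,
            key=lambda n: sum(1 for t in survivors if predicates[n](t)),
        )
        route.append(best)
        remaining.remove(best)
        survivors = [t for t in survivors if predicates[best](t)]
    return route


def next_batch_size(
    k: int,
    route_changed: bool,
    k_min: int = 8,
    k_max: int = 1024,
) -> int:
    """Adjust the adaptation granularity (assumed doubling rule).

    A stable route suggests stable stream characteristics, so the
    route is recomputed less often; a changed route resets k so the
    policy reacts quickly again.
    """
    if route_changed:
        return k_min
    return min(k * 2, k_max)
```

For example, with predicates `a: x < 50`, `b: x < 40`, and `c: x % 2 == 0` over `x` in 0..99, ranking by marginal pass rate places `a` right after `b`, even though every tuple that passes `b` also passes `a`. The conditional greedy order places `c` second, because `a` filters nothing once `b` has been applied.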
Keywords/Search Tags:data stream, data stream management system, adaptive query processing, routing policy, adaptation granularity