
Estimating Sliding Window-Based Aggregation Queries Over Probabilistic Data Streams

Posted on: 2010-09-01    Degree: Master    Type: Thesis
Country: China    Candidate: Q T Wang    Full Text: PDF
GTID: 2178360275991819    Subject: Computer software and theory
Abstract/Summary:
Probabilistic data streams have attracted considerable attention recently, because the data generated by a wide range of sources is inherently fuzzy or uncertain. A data stream is a continuous, time-varying, unbounded sequence of data items. Stream items usually take the form of relational tuples that are discarded once they have been processed, so online stream algorithms are restricted to a single pass over the data. In a traditional data stream, also called a certain or deterministic data stream, each element is deterministic. A probabilistic data stream generalizes and extends this model: each element represents a probability distribution over a set of possible events. Such streams can therefore naturally handle probabilistic, uncertain and fuzzy data, and they are applied in many domains, including data cleaning, information integration and multi-sensor computing. Because data streams are time-sensitive and unbounded, we are often interested only in the most recent elements, so research on window-based probabilistic data streams is both necessary and meaningful. In this paper, we propose the first algorithms for approximating sliding window-based aggregate queries over probabilistic data streams. The queries fall into three groups: numerical aggregates, such as SUM, F1 (COUNT) and F0 (DISTINCT COUNT); probabilistic top-k queries; and finding frequent items. We also analyze the time and space complexity of the algorithms and evaluate the benefits of the proposed technique experimentally.
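To make the query model concrete, the following is a minimal sketch of the expected SUM and expected COUNT over a count-based sliding window. It is an exact, naive baseline that stores every item in the window, not the approximation algorithms proposed in the thesis, which the abstract does not describe. The class name `ExpectedWindowAggregates`, the dictionary representation of an element (value mapped to probability, with any missing probability mass meaning the element is absent), and the count-based window are all assumptions made for illustration.

```python
from collections import deque


class ExpectedWindowAggregates:
    """Exact expected SUM and COUNT over the last `window_size` stream items.

    Hypothetical baseline for illustration only: it keeps O(window_size)
    state, whereas approximation algorithms aim to use far less space.
    """

    def __init__(self, window_size):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.window = deque()       # per-item (expected value, existence probability)
        self.expected_sum = 0.0
        self.expected_count = 0.0

    def add(self, distribution):
        """Append one probabilistic element given as {value: probability}.

        The probabilities may sum to less than 1; the remainder is the
        probability that the element does not exist at all (assumption).
        """
        total = sum(distribution.values())
        if total > 1.0 + 1e-9 or any(p < 0 for p in distribution.values()):
            raise ValueError("probabilities must be non-negative and sum to at most 1")

        # By linearity of expectation, each item contributes independently.
        item_sum = sum(value * p for value, p in distribution.items())
        item_count = total

        self.window.append((item_sum, item_count))
        self.expected_sum += item_sum
        self.expected_count += item_count

        # Evict the oldest item once the window is over capacity.
        if len(self.window) > self.window_size:
            old_sum, old_count = self.window.popleft()
            self.expected_sum -= old_sum
            self.expected_count -= old_count


agg = ExpectedWindowAggregates(window_size=2)
agg.add({10: 0.5})           # present with probability 0.5
agg.add({4: 0.5, 6: 0.5})    # certainly present, value 4 or 6
agg.add({2: 1.0})            # evicts the first element
print(agg.expected_sum, agg.expected_count)
```

Linearity of expectation is what allows the two aggregates to be maintained incrementally here. F0 (DISTINCT COUNT), top-k and frequent items do not decompose this simply, which is one reason approximation is of interest for them.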
Keywords/Search Tags: probabilistic data stream, aggregate, sliding window