
Study And Application Of Data Stream Processing System Over High-speed Networks

Posted on: 2008-10-25  Degree: Master  Type: Thesis
Country: China  Candidate: L S Chen  Full Text: PDF
GTID: 2178360242979483  Subject: Control Engineering
Abstract/Summary:
Traditional database management systems are best equipped to run one-time queries over finite stored data sets. However, many modern applications, such as network monitoring, financial analysis, manufacturing, and sensor networks, require continuous queries over continuous, unbounded streams of data. In these applications, data does not take the form of finite stored data sets, but rather arrives in multiple, continuous, rapid, time-varying data streams. Applications that require real-time processing of high-volume data streams are pushing the limits of traditional data processing infrastructures. It is not feasible to simply load arriving data into a traditional database management system and operate on it there. Traditional database management systems do not directly support the continuous queries that are typical of data stream applications. Furthermore, both approximation and adaptivity are key ingredients in data stream processing, while traditional DBMSs focus largely on the opposite goal of precise answers computed by stable query plans. As a result, stream-oriented applications tend to use a DBMS largely as an offline storage system.

Based on control theory and an appropriate scheduling strategy, our aim is to propose a prototype data stream processing system that adapts to the arrival characteristics of data streams and to the fluctuating network environment. First, based on the characteristics of data streams and of traditional DBMSs, six aspects of data processing are presented to characterize the requirements for real-time query processing. Then, a subsection scheduling strategy, with the goals of minimizing memory requirements and keeping output latency low, is used in the continuous query model for data streams. The main modification made to standard SQL, in addition to allowing the FROM clause to refer to streams as well as relations, is to extend the expressiveness of the query language for sliding windows. Finally, based on control theory, the architecture of the data stream processing system is designed specifically to address the issues above. Many additional problems are still under investigation; further directions are discussed in the conclusion of the paper.
Keywords/Search Tags: High-speed Network, data stream, query processing