Font Size: a A A

Anomaly Detection Over Data Streams

Posted on:2007-03-13Degree:DoctorType:Dissertation
Country:ChinaCandidate:S K QinFull Text:PDF
GTID:1118360212484575Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Recently, real-time monitoring and online data mining over data streams have been attracting more and more research attention in database research community because of its wide variety of real world applications. For instance, it is very important to monitor the data streams for telecommunication network performance tuning, highway traffic management, trend-related prediction analysis, web-click streams analysis, information system intrusion detection, and sensor networking, etc. Some of these tasks are mission critical which require fast response.Conventional database techniques were initially invented to manage static data collections, which can not be applied directly to monitor or mine dynamic data streams. Hence a robust and efficient anomaly detection is needed to process huge amount of information in a constrained time period, in order to online monitor the data stream. The most important constraints or requirements for the data stream processing include limited memory availability, sequentially linear scanning, and online accurate result report over evolving data streams. However, existing methods on anomaly detection suffer several problems. Our major contributions in this paper are as follows.1. A novel concept of adaptive aggregation burst is introduced for precisely modelling real-life data streams. This burst model empowers users to detect double-side relative bursts dynamically without disturbed by bumps. We propose a framework for accurately detecting such bursts under false positive and false negative constraints. The algorithms are developed based on a space and time efficient histogram, IH, which can be maintained with O(n(logn+logR/log(1+δ))) time and O(logn+logR/log(1+δ)) space. Thus IH is more suitable for burst detection than other popular histograms.2. A monotonic search space based burst detection method is proposed. Firstly, the monotonic search space building algorithms are proposed for transformingreal life data. With the fractal transforming, the miss sorting error errMS can be guaranteed 0. To construct such a monotonic search space for multi-window burst detection, a novel approach is presented, which could decrease the time overhead of query processing from O(m) to O(logm), where m is the number of windows being monitored.3. Propose a parameter-free anomaly detection algorithm in this paper. We first introduce the piecewise fractal model (PFM) for data stream processing, which consumes O(n log n) time and O(logn) space to model a n length stream. Based on this model, a burst detection algorithm is proposed which can provide accurate burst detection. A novel PFM based method is proposed for anomaly detection. This algorithm consumes only O(1) memory and O(n) time without involving training process.This paper give a survey on the state of art anomaly detection over data streams. Based on the analysis of existing research works, a more general definition of anomaly and existing problems are proposed. This paper also propose a system architecture for anomaly detection over data streams which integrates the novel anomaly detection techniques proposed in this paper. Theoretical analysis and experimental results show that this algorithm can achieve higher precision with less space and time complexity as compared with the existing methods, and it could be concluded that the proposed algorithm is suitable for burst detection over data streams.
Keywords/Search Tags:data stream, anomaly detection, fractal, piecewise fractal model, search space
PDF Full Text Request
Related items