An Algorithm On Clustering And Anomaly Detection For Multiple Data Streams

Posted on: 2012-02-04; Degree: Master; Type: Thesis
Country: China; Candidate: N Jiang; Full Text: PDF
GTID: 2218330368983187; Subject: Computer system architecture
Abstract/Summary:
As a new data model, the data stream plays an important role in many applications, such as network traffic management, financial monitoring, traffic control, and e-business. Processing and mining techniques for multiple data streams have been widely studied. The unbounded, high-speed nature of multiple data streams and the need for fast online response in these applications break many assumptions of traditional databases. In addition, processing multiple data streams requires attention not only to changes within a single stream but also to correlation analysis across many streams. Although clustering and anomaly detection for multiple data streams have been widely studied, many problems remain unsolved. In this thesis, we study clustering-based anomaly detection for multiple data streams and implement an improved algorithm that combines a clustering algorithm with an outlier mining algorithm. It can find clusters of arbitrary shape, identify local outliers, and detect anomalies. First, we analyze the relevant theory of data stream mining. Taking the characteristics of multiple data streams into account, we review the research directions for multiple data streams, the existing anomaly detection methods, and the remaining difficulties and challenges.
Building on discrete wavelet transform (DWT) compression of multiple data streams, we propose an improved algorithm for clustering and anomaly detection over multiple data streams. The algorithm first preprocesses the streams, using the correlation between streams and the discrete wavelet transform to obtain compressed data streams. This reduces the system's memory requirements and shortens processing time. Next, we improve the similarity matrix, which supplies more complete data and improves the accuracy of the clustering results. We then compute the local reachability density of each data point, mark the core points, and cluster the data streams, which allows clusters of arbitrary shape to be found. Finally, we compute the local outlier factor (LOF) of each noise point, output the set of local outlier factors, and detect anomalies by setting a threshold on the LOF value. Experiments show that the algorithm can cluster multiple data streams and detect anomalies via LOF, and that its clustering time is lower than that of DBSCAN.
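The compress-then-score idea described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes a Haar wavelet, keeps only the approximation coefficients, skips the improved similarity matrix and the density-based clustering step, and scores every stream with the standard LOF definition instead of scoring noise points only. The function names, the toy streams, `k=2`, and the threshold of 2.0 are all hypothetical choices made for illustration.

```python
import math

def haar_compress(stream, levels=1):
    """Return the Haar approximation coefficients after `levels` steps.

    Each step halves the length; detail coefficients are discarded,
    which is the lossy compression assumed in this sketch.
    """
    approx = [float(x) for x in stream]
    for _ in range(levels):
        if len(approx) < 2:
            break
        # Orthonormal Haar low-pass filter on non-overlapping pairs
        # (a trailing unpaired sample is dropped).
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx) - 1, 2)]
    return approx

def local_outlier_factors(points, k=2):
    """Standard LOF score per point, Euclidean distance; needs len(points) > k."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # distance to the k-th nearest neighbour
        k_dist.append(kd)
        # k-distance neighbourhood; ties at the k-distance are included.
        neighbours.append([j for j in order if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        mean_reach = sum(reach) / len(reach)
        lrd.append(math.inf if mean_reach == 0 else 1.0 / mean_reach)

    # LOF: mean density of the neighbours relative to the point's own density.
    lof = []
    for i in range(n):
        if math.isinf(lrd[i]):
            lof.append(1.0)  # exact duplicates: treated as inliers here
        else:
            nb = neighbours[i]
            lof.append(sum(lrd[j] for j in nb) / (len(nb) * lrd[i]))
    return lof

# Toy data (hypothetical): four similar streams and one that behaves differently.
streams = [
    [1, 1, 2, 2, 3, 3, 4, 4],
    [1, 1, 2, 2, 3, 3, 4, 5],
    [1, 2, 2, 2, 3, 3, 4, 4],
    [1, 1, 2, 3, 3, 3, 4, 4],
    [9, 9, 1, 1, 9, 9, 1, 1],
]
compressed = [haar_compress(s, levels=1) for s in streams]  # 8 -> 4 values each
scores = local_outlier_factors(compressed, k=2)

LOF_THRESHOLD = 2.0  # hypothetical cut-off, not taken from the thesis
anomalies = [i for i, s in enumerate(scores) if s > LOF_THRESHOLD]
print(anomalies)
```

On this toy input the four similar streams score close to 1, while the last stream scores far above the threshold and is the only one flagged. Scoring the compressed coefficients rather than the raw samples halves the data held per stream at each level, which is the memory and time saving the abstract attributes to the DWT preprocessing.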
Keywords/Search Tags: Multiple Data Streams, Clustering, Discrete Wavelet Transform, Anomaly Detection, Local Outlier Factor