
Real-Time Anomaly Detection Based on Sampled Peculiarity Factor

Posted on: 2013-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: S P Shi
GTID: 2248330371990242
Subject: Computer application technology
Abstract/Summary:
With the development of network and information technology, the amount of information in many fields is growing rapidly. As data sets become larger, the demand for real-time data mining is also increasing. Peculiarity data mining is an important part of data mining and knowledge discovery, because peculiar data often contains important information, such as noise, faults, and intrusions. With growing concern about fraud detection, network intrusion, and fault diagnosis, real-time anomaly detection has received increasing attention.

Many anomaly detection techniques exist, for example methods based on statistics, clustering, distance, and density, but they have limitations when higher accuracy and speed are required. In this thesis, the sample peculiarity factor (SPF) is used to integrate distance-based and density-based methods, and the characteristics of the data distribution are fully considered when detecting anomalies. Experimental results show that the SPF-based anomaly detection algorithm loses little precision while significantly reducing computing time, which makes it suitable for real-time anomaly detection. The main work of the thesis is as follows:

(1) The anomaly detection algorithm is analyzed from the standpoint of mathematical statistics to provide accuracy guarantees for the sampling method. The sampling algorithm is combined with the traditional distance-based k-NN algorithm, and its quality is measured by analyzing the expectation and variance of the outliers that the sampling algorithm returns. A distance database is constructed to approximate the computation over the full data set, with the sample variance used to estimate the population variance. This provides a theoretical basis for anomaly detection based on the sample peculiarity factor.

(2) A learning algorithm for the optimal sampling frequency is proposed. Once the sampling method is selected, the sampling frequency is obtained by binary learning. For a given confidence level, the confidence interval of the sampling frequency is computed and taken as the optimal sampling frequency range. A sample subset is then drawn at that frequency for anomaly detection. Experiments compare the SPF-based algorithm with the algorithms based on the peculiarity factor and the local peculiarity factor: with a sampling frequency between 1/32 and 1/16, the SPF-based algorithm is clearly faster, with little effect on accuracy.

(3) The sample peculiarity factor and the sampling frequency are used to detect anomalies in real time. First, the original data set is divided into a normal data set and an anomalous data set. The optimal sampling frequency is then learned on the normal data set, and a sampled subset is obtained. During real-time processing, only the SPF value of the current data point is calculated, and a rank comparison method determines whether the point is abnormal. Simulation results show that the false detection rate of the algorithm is 2%.
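The real-time procedure in (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several details are assumptions because the abstract does not state them: the score uses the commonly cited peculiarity factor form (the sum of distances raised to a power alpha, here 0.5, with Euclidean distance), the subset is drawn by simple random sampling without replacement, and the rank comparison is modeled as an empirical quantile test whose 0.95 threshold is arbitrary. The function names `draw_sample`, `sampled_peculiarity_factor`, and `is_anomalous` are hypothetical.

```python
import numpy as np


def draw_sample(data, frequency, rng):
    """Draw a random subset whose size is `frequency` times the data size."""
    data = np.asarray(data, dtype=float)
    size = max(1, int(round(len(data) * frequency)))
    idx = rng.choice(len(data), size=size, replace=False)
    return data[idx]


def sampled_peculiarity_factor(points, sample, alpha=0.5):
    """Assumed SPF form: sum over the sample of (Euclidean distance ** alpha)."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    sample = np.atleast_2d(np.asarray(sample, dtype=float))
    # Pairwise differences via broadcasting: shape (n_points, n_sample, n_dims).
    diffs = points[:, None, :] - sample[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return (dists ** alpha).sum(axis=1)


def is_anomalous(x, sample, reference_scores, quantile=0.95, alpha=0.5):
    """Flag x when its SPF ranks above `quantile` of the reference scores.

    `quantile` is an illustrative threshold, not a value from the thesis.
    """
    score = sampled_peculiarity_factor([x], sample, alpha)[0]
    rank = np.mean(np.asarray(reference_scores) <= score)
    return bool(rank > quantile)


# Offline step: sample the normal data and score it once.
rng = np.random.default_rng(0)
normal_data = rng.normal(size=(3200, 2))
sample = draw_sample(normal_data, 1 / 16, rng)
reference_scores = sampled_peculiarity_factor(normal_data, sample)

# Online step: only the incoming point is scored against the sample.
print(is_anomalous([8.0, 8.0], sample, reference_scores))
print(is_anomalous([0.0, 0.0], sample, reference_scores))
```

Under these assumptions, scoring one incoming point costs time proportional to the size of the sampled subset rather than the full data set, which is the source of the speed-up the abstract reports.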
Keywords/Search Tags: sample peculiarity factor, real-time, sample frequency, anomaly detection