Time Series Anomaly Detection Based On Features From Behavior Patterns Of Users

Posted on: 2018-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Xia
Full Text: PDF
GTID: 1310330533961394
Subject: Computer Science and Technology
Abstract/Summary:
Anomalous points are unexpected data generated by users (or systems) that do not conform to the regular data. Research on the causes and consequences of anomalies supports decision-making by users (or systems). For example, denial-of-service (DoS) attacks produce massive anomalous traffic on the Internet; analyzing their causes and predicting which links will be crippled helps operators reschedule routing strategies and avoid network congestion. As technology develops, the relationships among data become more complex and the number of anomaly types keeps increasing, and both trends limit the performance of current detection technologies. Hence, anomaly detection based on feature extraction is one of the most important research topics. From the perspective of user behavior patterns, we analyze the features of normal and anomalous data and carry out anomaly detection in traffic engineering and in recommender systems, combining signal processing and pattern recognition technologies. The work is summarized below.

(1) Time series anomaly detection based on normal behavior patterns. We propose an anomaly detection framework, BasisEvolution (BE), for network traffic time series. It consists of preprocessing, basis generation (typical traffic space construction), basis updating (typical traffic space evolution), and anomaly detection phases.
1) In real applications, the sampled data are massive and unlabeled. To address this in the preprocessing phase, we introduce a data cleaning strategy that combines multiple detection algorithms, inspired by collaborative learning.
2) 'Variants' or unknown types of anomalies cause commonly used anomaly detection algorithms either to fail or to produce high false alarm probabilities. To address this issue, we introduce a basis generation algorithm based on the normal behavior patterns of users. It constructs a normal network traffic space and labels points that cannot be projected into that space as anomalous points.
3) In real-world applications, time series change dynamically, so a static typical traffic space is not sufficient for new data. Following the idea of incremental learning, we design a basis updating process in which the normal behavior patterns are assumed to change slowly.
4) Detection on massive data yields a large number of anomalous points, which take considerable time and resources to process. Because anomalous points arising from the same cause are compact in time, we propose a clustering algorithm and evaluation criteria for this issue.
Compared with other detection methods, BasisEvolution achieves higher detection precision and a lower false alarm probability, is effective at finding 'variants' or unknown anomalies, adapts to new data, and reduces the number of anomalous points to be processed, which saves a large amount of time and resources.

(2) Time series anomaly detection based on a trend prediction model. A trend prediction model detects anomalous points by estimating future network traffic and comparing the real values with the predicted values. To address the prediction problem in artificial neural network (ANN) based prediction models, we introduce a trend-prediction-based time series anomaly detection model. In this method, following the normal behavior patterns of users (or networks), we first analyze the periodicity and short-term dependency of network traffic and predict the trend containing those characteristics. Then, by predicting the traffic components with those characteristics, we estimate the future network traffic. Compared with the original BackPropagation (BP) model, the proposed method achieves higher prediction precision and higher detection precision.

(3) Time series anomaly detection based on anomalous behavior patterns. In recommender systems, we analyze the characteristics of anomalous ratings and propose a dynamic time interval segmentation and hypothesis test detection based framework (SDF). SDF mainly addresses feature extraction and real-time anomaly detection. In addition, we introduce a new criterion for algorithm evaluation.
1) Anomaly detection model SDF:
a. To distinguish the characteristics of anomalous ratings, we analyze the anomalous behavior patterns of malicious users in shilling attacks and extract features of anomalous ratings from these patterns.
b. Based on the extracted features, we propose SDF. SDF gathers every two neighboring ratings into clusters and detects anomalous groups by hypothesis testing, which gives accurate online detection results.
2) A stability evaluation criterion: In anomaly detection, the results of the same algorithm on the same points may differ across time scales. We discuss the reasons for and consequences of this inconsistency and design a stability criterion for algorithms.
From the perspective of target items, the characteristics we discuss apply generally to all types of shilling attacks. Compared with other item-based detection methods, SDF has a lower false alarm probability and lower time complexity. Hence, SDF is suitable for real-time applications. Among the commonly used algorithms compared, SDF has the highest stability.
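
As a rough illustration of the prediction-based detection described in part (2), the sketch below forecasts each point from earlier values at the same phase of the period and flags points whose forecast error is unusually large. It is not the dissertation's model: the ANN/BP predictor is swapped for a plain seasonal median, short-term dependency is left out, and the function name `detect_anomalies` and the parameters `period`, `k`, and `min_scale` are assumptions made only for this example.

```python
from statistics import median, pstdev


def detect_anomalies(series, period, k=3.0, min_scale=1.0):
    """Return indices whose value deviates strongly from a seasonal forecast.

    Hypothetical helper, not the dissertation's method. The forecast for
    index t is the median of earlier values at the same phase
    (t - period, t - 2 * period, ...). The median is used so that an
    earlier spike at that phase does not distort later forecasts.
    """
    normal_errors = []  # forecast errors of points judged normal so far
    anomalies = []
    for t in range(period, len(series)):
        history = series[t % period:t:period]  # same phase, earlier periods
        predicted = median(history)
        error = series[t] - predicted
        # Scale the threshold by the spread of past errors; min_scale is an
        # assumed floor that avoids a zero threshold on very regular data.
        if len(normal_errors) >= 2:
            scale = max(pstdev(normal_errors), min_scale)
        else:
            scale = min_scale
        if abs(error) > k * scale:
            anomalies.append(t)
        else:
            normal_errors.append(error)
    return anomalies


if __name__ == "__main__":
    traffic = [10, 20, 30, 20] * 5
    traffic[14] = 90  # injected spike
    print(detect_anomalies(traffic, period=4))
```

For a series that repeats 10, 20, 30, 20 with a single spike of 90 at index 14, the function returns `[14]`. A fuller version would replace the seasonal median with a learned predictor and add a short-term dependency term, as the abstract describes.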
Keywords/Search Tags: Time series, Anomaly detection, Network traffic, Recommender system, Machine learning