Application Research Of Outlier Anomaly Detection Technology For Time Series Data

Posted on: 2020-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2438330575996414
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, server operation and maintenance costs keep rising, and one of the core tasks of server operation and maintenance is monitoring whether a server's indicators are abnormal. A server's metric monitoring data is typical time series data, and these indicators show whether the server and the applications on it are running well. Anomaly detection is therefore of great significance in the field of server operation and maintenance. There are many mature studies on anomaly detection for univariate time series data, but many problems remain for multidimensional time series data. In server operation and maintenance in particular, monitoring data comes in many varieties, in large volumes and at high flow rates, so algorithms must detect anomalies quickly. To address these problems, this thesis does the following research work:

(1) Anomaly detection for multidimensional time series data differs from that for univariate time series data in many ways. For example, the temporal continuity of a multidimensional time series is much weaker than that of a univariate one, and it is unreasonable to judge the entire multidimensional data as anomalous when only one dimension is anomalous. To address this, this thesis proposes iForest for multidimensional time-series (iForestFMT), a collective anomaly detection algorithm that mainly detects whether a subsequence or a time period of data in a multidimensional time series is abnormal. Considering the key role of temporal continuity in time series anomaly detection, iForestFMT makes full use of the statistical characteristics of time series and improves on the isolation forest algorithm, which performs very well in the field of anomaly detection. Experiments on three real data sets show that the proposed method is efficient.

(2) To detect in real time whether multiple server monitoring data in a certain time period are abnormal, this thesis adapts iForestFMT to collective anomaly detection on streaming data by adding mass information in the subspace, and proposes the iForest for multidimensional streams (iForestForStreams) algorithm. The experimental results show that iForestForStreams has relatively stable time and space complexity when processing streaming data and can process high-speed stream data in a timely manner. In addition, to improve the time efficiency, stability and scalability of iForestForStreams, this thesis redesigns and implements a distributed version of the algorithm on a mature distributed computing framework. Experiments show that the distributed version of iForestForStreams is 2.7 times more efficient than the original version and exhibits good scalability and stability.

(3) This thesis designs ADSO, a multi-functional time series anomaly detection system for the field of server operation and maintenance. The system implements four functional modules: data acquisition, feature calculation, model training and abnormal alarm. A simulated server operation and maintenance scenario is then used to verify the effectiveness of the system.

In summary, the proposed algorithms have practical significance for anomaly detection in the field of server operation and maintenance. The three data sets used in the experiments come from the UCI repository and are real server network data. The experiments verify that the algorithms help discover intrusion behavior in network data, and applying them to different kinds of server data helps discover different types of server anomalies.
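The abstract does not give the details of iForestFMT, so the following is only a minimal sketch of the general idea named in item (1): summarise each time window of a multidimensional series by simple statistics, then score the windows with a plain isolation forest. The non-overlapping windows, the choice of mean and standard deviation as features, the function names (`window_features`, `anomaly_scores`) and the synthetic data are all assumptions made for illustration, not the thesis's method.

```python
import math
import random
from statistics import mean, pstdev


def window_features(series, width):
    """Summarise each non-overlapping window by per-dimension mean and std."""
    feats = []
    for start in range(0, len(series) - width + 1, width):
        window = series[start:start + width]
        row = []
        for dim in zip(*window):  # one tuple of values per dimension
            row += [mean(dim), pstdev(dim)]
        feats.append(row)
    return feats


def build_tree(points, depth, max_depth, rng):
    """Isolation tree: split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    f = rng.randrange(len(points[0]))
    lo = min(p[f] for p in points)
    hi = max(p[f] for p in points)
    if lo == hi:  # feature is constant here, cannot split
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[f] < split]
    right = [p for p in points if p[f] >= split]
    return {"f": f, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def c(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(x, node, depth=0):
    """Depth at which x lands, plus the usual correction for unsplit leaves."""
    if "f" not in node:
        return depth + c(node["size"])
    child = node["left"] if x[node["f"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def anomaly_scores(feats, n_trees=100, sample_size=64, seed=0):
    """Score in (0, 1]; higher means easier to isolate. Needs >= 2 windows."""
    rng = random.Random(seed)
    psi = min(sample_size, len(feats))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(feats, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    scores = []
    for x in feats:
        avg = sum(path_length(x, t) for t in trees) / n_trees
        scores.append(2.0 ** (-avg / c(psi)))
    return scores


# Synthetic two-dimensional series with one disturbed time period (assumed data).
series = [(math.sin(t / 5), math.cos(t / 7)) for t in range(200)]
for t in range(120, 130):
    series[t] = (10 + 5 * math.sin(3 * t), -10 + 5 * math.cos(3 * t))

feats = window_features(series, width=10)
scores = anomaly_scores(feats)
suspect = max(range(len(scores)), key=scores.__getitem__)
print("most anomalous window:", suspect)
```

Scoring whole windows rather than single points is what makes this a collective detector: the disturbed period covers time steps 120 to 129, which is window index 12 at a width of 10, and it is flagged as a unit because its summary statistics are easy to isolate.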
Keywords/Search Tags: Time Series, Collective Anomaly Detection, Isolation Forest, Streaming Processing, Distributed Computing