
Research And Implementation Of Log Anomaly Detection Tool Based On Long Short-term Memory Network

Posted on: 2021-01-24  Degree: Master  Type: Thesis
Country: China  Candidate: J W Tang  Full Text: PDF
GTID: 2518306461969119  Subject: Master of Engineering
Abstract/Summary:
Large-scale distributed systems are becoming a core component of the IT industry, supporting many kinds of software, including online banking, e-commerce, and instant messaging. Unlike traditional stand-alone systems, most of these distributed systems run around the clock and provide essential services to millions of users worldwide, so any abnormal shutdown may cause significant revenue loss. At the same time, as network attack methods keep evolving, running systems face serious threats. This highlights the need for network defense and for reliable protection systems.

Anomaly detection is a key link in defending against network attacks. Anomaly detection algorithms have evolved from early methods based on statistics to methods based on machine learning. As data volumes grow to the gigabyte level, traditional detection methods can hardly find outliers in large-scale data sets. In recent years, anomaly detection algorithms based on deep learning have become a research hotspot and have proven more suitable than traditional machine learning methods for finding anomalies in large-scale data.

System logs record runtime information in detail, so they play an important role in anomaly detection. Raw logs are usually unstructured records and cannot be analyzed directly to find anomalies; raw log messages must first be parsed and converted into a series of structured events. Existing log parsing tools still perform poorly when processing large amounts of data. This thesis uses the large-scale data processing capacity of the Spark Streaming framework to improve the performance of the parsing tool and the efficiency of log parsing.

This thesis systematically describes log-based anomaly detection techniques and, on that basis, designs and implements a tool that analyzes real-time input log data and detects anomalies in it. The research contents and results mainly cover the following aspects. (1) A labeled log data set is used to train a long short-term memory (LSTM) network to learn normal log patterns. (2) The log parsing tool is implemented on the Spark Streaming framework to improve its processing efficiency on large-scale data. (3) With the trained LSTM as the detection module and the distributed parsing tool as the parsing module, a prototype log analysis tool is developed and deployed on the Hadoop platform to analyze real-time input log data and detect anomalies.

Open-source log data sets are used to evaluate the LSTM network and the parsing tool. Experiments show that the LSTM network can detect different types of anomalies in the experimental data. Supervised anomaly detection models based on a support vector machine (SVM) and a decision tree are chosen for comparison with the LSTM model (referred to as LSTM-log in this thesis), and the results show that LSTM-log outperforms the SVM and decision tree models in Precision, Recall, and F-measure. This provides the basis for implementing the anomaly detection module of the log analysis tool. For the parsing tool, comparative experiments show that, on the same data set, the distributed parsing tool achieves the same accuracy as the stand-alone parsing tool and is more efficient. Based on these experimental results, a prototype analysis tool was developed and deployed that can analyze incoming log records and detect anomalies in real time.
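The three metrics used to compare the models above can be illustrated with a minimal sketch. It assumes binary labels where 1 marks an anomalous log sequence and 0 a normal one, and it takes F-measure to be the balanced F1 score; the function name and the sample labels are hypothetical and are not taken from the thesis.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute Precision, Recall and F1 for binary anomaly labels.

    Assumption: 1 = anomalous, 0 = normal. This is an illustrative
    helper, not the evaluation code used in the thesis.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed anomalies

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for eight log sequences (made up for illustration).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Precision penalizes false alarms and Recall penalizes missed anomalies, so reporting both together with their harmonic mean gives a fuller picture than accuracy alone when anomalies are rare.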
Keywords/Search Tags: anomaly detection, log analysis, big data processing