
Research and Implementation of an LSTM-Based Anomaly Detection Method for Software Systems

Posted on: 2020-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Lu
Full Text: PDF
GTID: 2428330602450194
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the computer industry, computer software has come to be used in every part of society. To meet increasingly complex requirements, software systems keep growing in scale. As reliability requirements rise, anomaly detection has become increasingly important as a key step in error discovery, failure recovery, and root-cause analysis. Software anomaly detection is the process of detecting, while the software is running, behavior that does not meet the developer's expectations. Log data, the most important resource developers have for understanding and analyzing the running state of a system, is the basic input to software anomaly detection, and log-based anomaly detection has become an essential step in current software development. It still faces many difficulties, however: unstructured logs are hard to parse, anomalies must be detected in real time, the content of log messages is underused, and log messages are strongly correlated with their context. To address these four problems, this thesis presents a software anomaly detection method based on long short-term memory (LSTM) networks. The research covers the following aspects:

(1) For log parsing, this thesis designs a general real-time structured parsing method that uses the longest common subsequence (LCS) to parse text-formatted log messages into structured, machine-processable data in real time. The LCS computation is further optimized with a prefix tree, which reduces the time complexity of parsing a log message from O(m*n^2) to O(n).

(2) Given the contextual correlation of log content and the fact that anomalies are irregular and follow no fixed pattern, this thesis uses an LSTM model to extract the behavior patterns of normal log sequences and to predict each new log message from w, the context of preceding log messages. In other words, the LSTM model learns Pr[lm_t | w], the probability distribution of the log message value. Anomaly detection then relies on the difference between the predicted content and the actual content: a log message that differs substantially from the prediction is judged to be an anomaly.

(3) To make full use of the log content during detection, this thesis builds an execution path anomaly detection model and a parameter trajectory anomaly detection model, which correspond to the text sequence and the parameter sequence in the structured data, respectively. The execution path model detects anomalies in the software workflow, while the parameter trajectory model detects performance anomalies. After deployment, the two models are combined for anomaly detection.

Finally, experiments are carried out on two datasets, Hadoop and CloudStack, and the method is compared with anomaly detection models based on principal component analysis (PCA), N-Gram, and CLSTR. The proposed method obtains F-measure values of 0.8221 on the CloudStack dataset and 0.8128 on the Hadoop dataset, a clear improvement over the traditional anomaly detection methods, which supports the correctness and effectiveness of the proposed method.
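The LCS-based parsing idea in item (1) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `LogParser`, `align`, `WILDCARD`, and the 0.5 matching threshold are assumptions made for illustration, and the prefix-tree optimization mentioned in the abstract is omitted, so every message is compared against every known template with the classic dynamic-programming LCS.

```python
from typing import List, Tuple

WILDCARD = "<*>"  # hypothetical marker for a variable (parameter) position


def lcs(a: List[str], b: List[str]) -> List[str]:
    """Return one longest common subsequence of two token lists (classic DP)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover the subsequence itself.
    out: List[str] = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def align(constants: List[str], tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Split tokens into a template (constants plus wildcards) and parameters."""
    template: List[str] = []
    params: List[str] = []
    k = 0
    for tok in tokens:
        if k < len(constants) and tok == constants[k]:
            template.append(tok)
            k += 1
        else:
            params.append(tok)
            # Consecutive variable tokens collapse into a single wildcard.
            if not template or template[-1] != WILDCARD:
                template.append(WILDCARD)
    return template, params


class LogParser:
    """Streaming parser: each message either refines a template or starts a new one."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold  # assumed: minimum shared fraction of tokens
        self.templates: List[List[str]] = []

    def parse(self, message: str) -> Tuple[int, List[str]]:
        tokens = message.split()
        best_id, best_common = -1, []
        for idx, template in enumerate(self.templates):
            constants = [t for t in template if t != WILDCARD]
            common = lcs(constants, tokens)
            if len(common) > len(best_common):
                best_id, best_common = idx, common
        if best_id >= 0 and len(best_common) >= self.threshold * len(tokens):
            self.templates[best_id], params = align(best_common, tokens)
            return best_id, params
        self.templates.append(tokens)
        return len(self.templates) - 1, []
```

Parsing `"Receiving block blk_1 src 10.0.0.1"` and then `"Receiving block blk_2 src 10.0.0.2"` with one `LogParser` assigns both messages the same template id and extracts `blk_2` and `10.0.0.2` as parameters of the second message. The template id feeds a text (execution path) sequence and the extracted values feed a parameter sequence, matching the two kinds of structured data described in item (3).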
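The prediction-based decision rule in item (2) can be sketched in Python as follows. To keep the example self-contained, a simple frequency table stands in for the LSTM as the estimator of Pr[lm_t | w]; it is not an LSTM and is not the thesis's model. The names `WindowModel` and `detect`, the window length, and the top-g acceptance criterion are all assumptions; the abstract only states that a message that differs greatly from the prediction is judged anomalous.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple


class WindowModel:
    """Count-based stand-in for the LSTM: estimates Pr[lm_t | w] from frequencies."""

    def __init__(self, window: int = 2):
        self.window = window  # assumed context length (number of preceding keys)
        self.counts: Dict[Tuple[int, ...], Counter] = defaultdict(Counter)

    def fit(self, normal_sequence: Sequence[int]) -> None:
        """Learn next-key frequencies from a sequence of normal log keys."""
        for t in range(self.window, len(normal_sequence)):
            w = tuple(normal_sequence[t - self.window:t])
            self.counts[w][normal_sequence[t]] += 1

    def predict(self, w: Sequence[int]) -> Dict[int, float]:
        """Return the estimated distribution over the next key given context w."""
        counter = self.counts.get(tuple(w))
        if not counter:
            return {}  # context never seen during training
        total = sum(counter.values())
        return {key: n / total for key, n in counter.items()}


def detect(model: WindowModel, sequence: Sequence[int], top_g: int = 1) -> List[int]:
    """Return positions whose actual key is not among the top_g predicted keys."""
    anomalies: List[int] = []
    for t in range(model.window, len(sequence)):
        probs = model.predict(sequence[t - model.window:t])
        candidates = sorted(probs, key=probs.get, reverse=True)[:top_g]
        if sequence[t] not in candidates:
            anomalies.append(t)
    return anomalies
```

Trained on the normal key sequence `[1, 2, 3, 1, 2, 3, 1, 2, 3]`, `detect` flags the unexpected key at position 5 of `[1, 2, 3, 1, 2, 4, 1, 2, 3]`, and also the next two positions, because their contexts contain the unseen key. Swapping `WindowModel` for a trained LSTM that exposes the same `predict` interface would leave the decision rule unchanged.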
Keywords: Anomaly Detection, Log Analysis, LSTM, LCS, Prefix Tree