In recent years, as computer science and technology have advanced, systems and software have grown more complex, and system anomalies and crashes have become more frequent. Security personnel need efficient ways to locate system anomalies, and log data, which is rich in information, is the focus of their attention. Analyzing logs to mine system anomalies has become an important method in the field of anomaly detection, and log parsing and log anomaly detection are the two core problems in this area. For log parsing, the current mainstream methods extract log templates for semantic analysis through regular expressions, word frequency extraction, or clustering. These methods often work well for specific types of log statements but cannot be widely applied to logs with different complex structures, and they suffer from loss of log information and poor applicability. At the same time, the current mainstream log anomaly detection methods often focus only on the sequential execution pattern between statements and ignore the effect of anomalies on the latency and type of logs. As a result, important features are not fully used in log anomaly detection, and the best detection performance cannot be achieved. To address these problems, this paper proposes a multi-feature log anomaly detection method based on full log semantics.

(1) For log parsing, the method first removes the variable part of each log through preprocessing, then groups the processed log statements with a heuristic strategy, and finally clusters the log words with a log prefix tree. Grouping and the prefix tree structure improve the efficiency of log parsing while preserving all semantic information of the logs.

(2) To address the problems of current log anomaly detection methods, this paper proposes a log anomaly detection method based on multiple features. The method combines the semantic feature, time feature, and type feature of each log, processes the three features into a sequence of log feature vectors, and trains a bidirectional GRU neural network model with an attention mechanism to learn the behavior patterns of normal and abnormal log sequences. The trained model detects anomalies in logs more accurately, which improves the effect of anomaly detection.

(3) This paper evaluates the parsing accuracy of the log parsing method on 8 log datasets, including HDFS, Hadoop, and BGL, and evaluates the robustness of the method in detail on the HDFS and BGL log datasets. The results show that, first, the log parsing method fully extracts the semantic information of logs, is robust, achieves high parsing accuracy, and performs well on large datasets. Second, the log anomaly detection method improves accuracy and recall over the current mainstream methods in the field, reaching 98.1% accuracy and 95.2% recall on the HDFS distributed system log dataset.
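
The log parsing steps described above (variable removal, heuristic grouping, and prefix tree clustering) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the masking patterns in `VARIABLE_PATTERNS`, the choice of token count as the grouping heuristic, and the names `mask_variables` and `PrefixTreeParser` are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import defaultdict

# Hypothetical masking rules; the abstract does not list the actual ones.
VARIABLE_PATTERNS = [
    re.compile(r"blk_-?\d+"),                  # HDFS-style block IDs
    re.compile(r"\d+\.\d+\.\d+\.\d+(:\d+)?"),  # IPv4 address, optional port
    re.compile(r"\b\d+\b"),                    # bare integers
]

END = None  # sentinel key marking the end of a template in the tree


def mask_variables(line):
    """Replace variable parts with a placeholder and split into tokens."""
    for pattern in VARIABLE_PATTERNS:
        line = pattern.sub("<*>", line)
    return line.split()


class PrefixTreeParser:
    """Groups logs by token count (assumed heuristic), then stores each
    group's token sequences in a prefix tree built from nested dicts."""

    def __init__(self):
        self.groups = defaultdict(dict)  # token count -> prefix tree root

    def add(self, line):
        tokens = mask_variables(line)
        node = self.groups[len(tokens)]
        for token in tokens:
            node = node.setdefault(token, {})
        node[END] = node.get(END, 0) + 1  # count logs matching this template
        return " ".join(tokens)

    def templates(self):
        """Return a dict mapping each template string to its log count."""
        found = {}

        def walk(node, prefix):
            for key, child in node.items():
                if key is END:
                    found[" ".join(prefix)] = child
                else:
                    walk(child, prefix + [key])

        for root in self.groups.values():
            walk(root, [])
        return found


parser = PrefixTreeParser()
for log_line in [
    "Receiving block blk_-1608999687919862906 src: /10.250.19.102:54106 dest: /10.250.19.102:50010",
    "Receiving block blk_7503483334202473044 src: /10.251.215.16:55695 dest: /10.251.215.16:50010",
    "PacketResponder 1 for block blk_38865049064139660 terminating",
]:
    parser.add(log_line)

for template, count in parser.templates().items():
    print(count, template)
```

Because only the variable fields are masked and every remaining token is kept in the tree, the sketch retains the full wording of each template, which is the property the abstract refers to as preserving all semantic information.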