
Log Sequence Anomaly Detection Based On Local Information Extraction And Globally Sparse Transformer Model

Posted on: 2022-06-25 | Degree: Master | Type: Thesis
Country: China | Candidate: H Y Zhang | Full Text: PDF
GTID: 2518306569497434 | Subject: Computer technology
Abstract/Summary:
Anomaly detection for log sequences is a necessary task for intelligent system operation and fault diagnosis. The current mainstream approach is unsupervised and based on template prediction: anomaly detection is recast as a sequence prediction problem over log templates. The Transformer learns global information about sequence data through its self-attention mechanism and has proven effective in sequence prediction, but it still has two shortcomings. First, both the local correlation between adjacent items and the long-range dependence between distant items strongly influence the prediction results, yet the Transformer ignores the local correlation information between adjacent items. Second, some information in the sequence is irrelevant to the prediction task and may be noise; it contributes nothing to prediction and can even hurt it, yet the Transformer does not eliminate this information.

To address these problems, this dissertation investigates a local information extraction and globally sparse Transformer model. The model uses multi-layer convolution to capture locally relevant information between adjacent items and fuses it into the Transformer to learn the global dependency information of the sequence. It can handle both local transformations between adjacent items and complex transformations between distant items, compensating for the Transformer's weakness in local information extraction. At the same time, to reduce the influence of sequence noise on the prediction results, this dissertation improves the self-attention mechanism in the Transformer and proposes the globally sparse Transformer model. By introducing a sparse function, the model converts global attention into globally sparse attention, which adaptively retains important information to sharpen the attention over the global context while dropping irrelevant information, thereby eliminating noise. Experimental results show that the improved model outperforms the Transformer on the sequence prediction task: on three sequence datasets, its MSE values are reduced by 4.1%, 5.4%, and 12.0%, and its MAE values by 16.9%, 6.0%, and 26.2%, respectively.

Building on this work, the dissertation proposes LSADNET, a log sequence anomaly detection framework based on template prediction, implemented in three parts: log template extraction, log vectorization, and log template prediction. For log template extraction, an improved method based on streaming clustering combines literal similarity and semantic similarity to measure log text distance, and uses sliding-window matching to update log templates, so that log clustering and template extraction complete with only a single scan of the log data. For log vectorization, a formula for the log template transfer value is designed based on the co-occurrence pattern of log templates, and it is combined with log template semantic embedding and log key embedding. For log template prediction, the local information extraction and globally sparse Transformer model performs the anomaly detection over log sequences. Experimental results show that LSADNET performs well on log sequence anomaly detection, with its F1 values improving over the best existing method by 0.25% on the HDFS and 1.37% on the BGL log datasets, respectively.
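The two mechanisms at the core of the model, convolutional extraction of local correlations between adjacent items and a sparse function that zeroes out irrelevant attention weights, can be sketched as follows. This is a minimal NumPy sketch, not the thesis's implementation: the abstract does not name the specific sparse function, so sparsemax is assumed here as a representative choice, and all function names, shapes, and kernels are illustrative.

```python
import numpy as np

def sparsemax(z):
    # Project scores z onto the probability simplex; unlike softmax,
    # entries below a learned threshold tau receive exactly zero weight,
    # which is how irrelevant (noisy) positions are dropped.
    z_sorted = np.sort(z)[::-1]
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cssv        # positions kept in the support
    k_z = k[support][-1]                     # size of the support set
    tau = (cssv[k_z - 1] - 1.0) / k_z        # simplex-projection threshold
    return np.maximum(z - tau, 0.0)

def local_features(x, kernel):
    # 1-D convolution along the sequence axis (x: [seq_len, dim]),
    # capturing local correlations between adjacent items; zero-padded
    # so the sequence length is preserved.
    pad = len(kernel) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([(xp[i:i + len(kernel)] * kernel[:, None]).sum(axis=0)
                     for i in range(len(x))])

def sparse_self_attention(x):
    # Single-head self-attention with sparsemax in place of softmax:
    # low-scoring positions get zero attention instead of a small weight.
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)
    attn = np.apply_along_axis(sparsemax, 1, scores)  # rows sum to 1
    return attn @ x

# Illustrative pipeline: local features first, then sparse global attention.
seq = np.random.default_rng(0).normal(size=(8, 4))
out = sparse_self_attention(local_features(seq, np.array([0.25, 0.5, 0.25])))
```

In the full model, the convolutional features would be fused into each Transformer layer and the projections would be learned; here the fusion is reduced to simple composition to keep the sketch self-contained.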
Keywords/Search Tags:local information extraction, globally sparse Transformer, log anomaly analysis