Research On Anomaly Detection Based On Spatio-temporal Feature Mining

Posted on: 2024-01-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Hu
GTID: 1528307376981059
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, massive amounts of data are generated and accumulated, and network security and data security face enormous challenges. Data analysis increasingly attends to the temporal characteristics of data. Time series can effectively model historical data and extract valuable information and knowledge, making it possible to mine temporal patterns and assess data status. Analysis of time series data is therefore closely related to security defense, monitoring analysis, and data mining, and is widely used in applications such as intrusion detection, fault detection, log analysis, data cleaning, and system monitoring. Undetected data anomalies and inefficient anomaly fixes can lead to significant resource consumption and losses. Anomaly detection is an effective way to find data anomalies, while root cause localization provides a basis for inferring the main causes of anomalies and accelerates their repair.

As the demand for high-performance anomaly detection algorithms in scenarios such as security protection has grown stricter, research on anomaly detection and root cause localization faces many challenges, including: (1) Because time series data are diverse, they exhibit hidden sequence characteristics and complex, dynamic data distributions, which makes it difficult to mine sequential patterns and learn the data distribution effectively; (2) The correlation between time series introduces implicit context, which increases the complexity of analysis; and (3) The diversity and richness of abnormal events lead to complex causes of abnormalities and a huge search space for root causes, which places high demands on performance, compatibility, and interpretability. This thesis studies several key issues in anomaly detection and root cause localization in security defense and monitoring analysis scenarios, which are of high importance and urgent practical need. The contributions of the thesis are summarized as follows:

Firstly, this thesis addresses the problem of modeling the complex and dynamic distribution of time series data. An anomaly detection method based on data distribution learning is proposed. On the one hand, based on generative adversarial networks, the effectiveness and robustness of anomaly detection are improved by capturing the uncertainty of deep learning models and generating conditional distributions of time series data. On the other hand, based on online learning, a K-nearest neighbor Gaussian mixture model and a dynamic contextual scheduling strategy are applied to achieve adaptive anomaly detection for streaming data. Experiments show that the proposed method is effective and stable for anomaly detection in temporal data and adapts well to dynamic changes in data distribution in online scenarios.

Secondly, this thesis studies the problem of prior knowledge fusion and relationship mining for time series. A sequence mining-based anomaly detection method is proposed and applied to two scenarios: host intrusion detection and multistep attack detection. On the one hand, the collaborative relationship between system calls is exploited to construct a call relationship graph, and graph representation learning is used to generate an effective embedding representation of sequences, which improves the accuracy of anomalous sequence classification and identification. On the other hand, the relationship between security alerts and network attacks is analyzed, the alerts are correlated, and the mining algorithm is improved to mine and detect attacks. Experiments show that the proposed method improves both the quality of sequence mining and the accuracy of sequence anomaly detection.

Thirdly, this thesis studies the problem of relationship mining and anomaly detection for multi-dimensional time series. A relationship mining and anomaly detection method for multi-dimensional time series data is proposed. The method constructs a sequence relationship graph based on the similarity of nodes, applies pruning strategies to keep the graph simple, and uses causal inference methods to establish sequence causality graphs as constraints so that the result is causally valid. Combined with neural networks, this achieves end-to-end automatic sequence relationship mining, which reveals implicit spatial contextual information. On the multi-dimensional temporal data obtained after relationship mining, the method applies a spatio-temporal attention mechanism that handles spatio-temporal context for prediction as well as anomaly detection, which improves the effectiveness and stability of anomaly detection for multi-dimensional temporal data. Comparative experiments show that the proposed method is effective for sequence relationship mining and that its anomaly detection performance on multi-dimensional temporal data is better than that of existing methods.

Finally, this thesis studies the problem of locating and explaining the root cause of multi-dimensional anomaly events. A root cause localization method based on event propagation is proposed. The method introduces a unified event description framework for the propagation and hierarchical aggregation of anomalous events and designs a quantitative measure, applicable to different event types, to evaluate the impact of events on anomalies. By jointly considering the behavioral consistency of abnormal events and the numerical significance of their deviations, the search space and search strategy for root cause events are effectively optimized. The method explains the root cause localization results from two aspects: the proportion of the anomaly contributed by each event, and the path along which events propagate and can be traced. Experimental results show that the proposed algorithm achieves better root cause localization accuracy than existing methods, and the execution speed is improved.
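The abstract does not say which similarity measure or pruning strategy the thesis uses when it builds the sequence relationship graph. As a rough illustration only, the sketch below assumes absolute Pearson correlation as the node similarity and a simple "threshold plus top-k neighbors" rule as the pruning step. The function names, the `top_k` and `min_sim` parameters, and the sample series are all hypothetical, and the causal-inference constraint described in the abstract is not modeled here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series is treated as unrelated to everything
    return cov / (sx * sy)


def build_relation_graph(series, top_k=2, min_sim=0.5):
    """Return {node: [(neighbor, similarity), ...]} after pruning.

    Assumed pruning rule (not taken from the thesis): drop edges whose
    absolute correlation is below min_sim, then keep at most top_k of the
    strongest remaining neighbors for each node.
    """
    graph = {}
    for name, values in series.items():
        scored = [
            (other, abs(pearson(values, other_values)))
            for other, other_values in series.items()
            if other != name
        ]
        scored = [(other, sim) for other, sim in scored if sim >= min_sim]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        graph[name] = scored[:top_k]
    return graph


# Hypothetical monitoring metrics, one time series per dimension.
series = {
    "cpu": [1.0, 2.0, 3.0, 4.0, 5.0],
    "load": [2.1, 3.9, 6.2, 8.0, 9.9],
    "temp": [5.0, 4.0, 3.0, 2.0, 1.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0],
}
graph = build_relation_graph(series, top_k=2, min_sim=0.5)
for node, neighbors in graph.items():
    print(node, [name for name, _ in neighbors])
```

In this toy input, the strongly correlated (or anti-correlated) metrics end up linked to each other, while the alternating `noise` series keeps no edges. A graph of this kind is what a downstream model could consume as spatial context between dimensions.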
Keywords/Search Tags:anomaly detection, spatio-temporal feature mining, root cause localization, data mining, machine learning