Research On Algorithms Of Anomaly Detection And Root Cause Analysis In AIOps

Posted on: 2021-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhang
Full Text: PDF
GTID: 2428330647950187
Subject: Control engineering
Abstract/Summary:
At present, there are nearly 100 billion devices in the world. These devices carry countless services, covering the Internet, finance, intelligent manufacturing and other areas. The IT systems of enterprises continue to grow in scale, and traditional IT operations can no longer meet the needs of their digital transformation. AIOps (artificial intelligence for IT operations), first proposed in 2017, promises high efficiency and low cost. It is predicted that 40% of IT operations will be transformed into AIOps by 2022. The era of AIOps is coming. This thesis studies two key scenarios and technologies in AIOps: anomaly detection and root cause analysis.

(1) For metrics data in IT operations, a time series anomaly detection algorithm is studied. Three difficulties are typical: anomaly labels and prior knowledge about the time series are hard to obtain; constantly updated IT systems require algorithms to run in real time; and anomalies in time series take diverse forms. An online unsupervised anomaly detection algorithm based on granular computing and optimized kernel density estimation is proposed. On the Numenta Anomaly Benchmark, the proposed algorithm outperforms five typical algorithms in most cases, with higher detection rates, lower false alarm rates, and earlier detection of contextual anomalies and concept drifts.

(2) For concurrent alarms generated during anomaly detection, a root cause analysis algorithm is studied. With the development of the Internet, IT systems are becoming increasingly complex. Once a local anomaly occurs, it tends to spread, triggering dense concurrent alarms. A root cause analysis algorithm based on an anomaly propagation graph is proposed. It comprises two sub-algorithms, random walk and state iteration, which track the anomaly propagation process and locate the root cause. Compared with three typical algorithms, the proposed algorithm localizes root causes more accurately and more quickly in scenarios with complex call chains and resource competition, and is more robust to erroneous alarms.
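To make the random-walk idea in (2) concrete, the following is a minimal sketch only. It assumes a generic random walk with restart over a hypothetical call graph; the abstract does not specify how the thesis builds its anomaly propagation graph or how the state iteration sub-algorithm works, so neither is reproduced here. The function name `rank_root_causes`, the parameters `restart` and `iters`, and the example services are all illustrative assumptions.

```python
from collections import defaultdict


def rank_root_causes(edges, alarmed, restart=0.15, iters=200):
    """Rank nodes as root-cause candidates by random walk with restart.

    edges:   (caller, callee) pairs. Assumption: an anomaly in a callee
             propagates to its callers, so the walker moves caller -> callee.
    alarmed: set of nodes that are currently raising alarms.
    """
    nodes = sorted({n for e in edges for n in e} | set(alarmed))

    # Keep only edges that lead to an alarmed callee.
    out = defaultdict(list)
    for caller, callee in edges:
        if callee in alarmed:
            out[caller].append(callee)

    # The walker starts (and restarts) uniformly over the alarmed nodes.
    start = {n: (1.0 / len(alarmed) if n in alarmed else 0.0) for n in nodes}
    score = dict(start)

    for _ in range(iters):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            targets = out[n] or [n]  # at a dead end the walker stays put
            share = (1.0 - restart) * score[n] / len(targets)
            for t in targets:
                nxt[t] += share
        score = nxt

    # Nodes where the walker spends the most time come first.
    return sorted(nodes, key=lambda n: score[n], reverse=True)


# Hypothetical call chain: web -> api -> {db, cache}; cache is healthy.
edges = [("web", "api"), ("api", "db"), ("api", "cache")]
alarmed = {"web", "api", "db"}
print(rank_root_causes(edges, alarmed))  # ['db', 'api', 'web', 'cache']
```

In this toy example the walker drifts downstream along alarmed calls and accumulates at `db`, the deepest alarmed service, which is why it ranks first. A real system would weight edges (for example by metric correlation) rather than treating all alarmed callees equally.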
Keywords/Search Tags:Artificial Intelligence for IT Operations, Time Series, Anomaly Detection, Kernel Density Estimation, Root Cause Analysis, Random Walk