
Design And Implementation Of Fault Location System Based On The Multi-dimensional Indicators

Posted on: 2022-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: T X Li
GTID: 2518306341452004
Subject: Electronics and Communications Engineering
Abstract/Summary:
For large-scale online service systems, keeping the online system stable is essential to maintaining a high-quality user experience and service quality, and it is where operation and maintenance engineers provide their value. Such systems typically have three characteristics: first, the volume of data is large; second, the indicators have multiple dimensions; third, the real-time requirements are very strict. Under these conditions, manual monitoring or a simple rule-based automatic operation and maintenance system can hardly detect faults and locate the real root cause set quickly and comprehensively. Artificial Intelligence for IT Operations (AIOps) emerged in response. AIOps covers two major topics: fault detection, that is, quickly and accurately detecting faults in the online system; and root cause location, that is, quickly locating the real root cause set of a failure.

This paper proposes an algorithm for each of these two topics and implements a fault location system for multi-dimensional indicators. The fault detection algorithm is based on a variational autoencoder model improved with optimized gated recurrent units; the root cause location algorithm is based on a Monte Carlo tree search model improved with explanatory power and potential relevance scores. For fault detection, the proposed model uses optimized gated recurrent units within the variational autoencoder framework, because gated recurrent units can capture correlations in time series. This effectively addresses the limitations of the traditional variational autoencoder in time series detection. For root cause location, this paper proposes a new indicator, the potential relevance score, and adds pruning based on time series relevance and explanatory power before the Monte Carlo tree search starts. The new indicator then guides the search in the pruned Monte Carlo tree to find the true element combination that causes the fault, which completes the root cause location. Experiments show that both algorithms are effective.

Based on these two algorithms, a fault location system for multi-dimensional indicators is designed and implemented. The system consists of three modules: a data acquisition module, a fault detection module, and a root cause location module. It implements the complete process from online data acquisition, through real-time fault detection, to fast root cause location. Functional and performance tests demonstrate the high availability of the system's three submodules.
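
The abstract mentions pruning by explanatory power before the Monte Carlo tree search but does not define the measure. The sketch below is only an illustration under stated assumptions: it takes explanatory power to be the share of a dimension's total change (actual minus forecast) that a single element accounts for, a definition commonly used in multi-dimensional root cause analysis and not confirmed as the thesis's own. The names `Element`, `explanatory_power`, `prune_candidates`, and the threshold value are all hypothetical. Time series relevance pruning, the potential relevance score, and the tree search itself are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One attribute value of a multi-dimensional indicator (hypothetical type)."""
    dimension: str   # e.g. "province"
    value: str       # e.g. "Beijing"
    forecast: float  # expected indicator value for this element
    actual: float    # observed indicator value for this element


def explanatory_power(elem, total_forecast, total_actual):
    """Share of the dimension's total change explained by one element.

    Assumed definition: (actual - forecast) / (total_actual - total_forecast).
    Returns 0.0 when the dimension shows no overall change.
    """
    total_change = total_actual - total_forecast
    if total_change == 0:
        return 0.0
    return (elem.actual - elem.forecast) / total_change


def prune_candidates(elements, threshold=0.1):
    """Keep elements whose explanatory power reaches the threshold.

    Totals are computed per dimension, so each element is compared only
    with the other values of its own dimension. The threshold of 0.1 is
    an arbitrary illustrative value. Returns (element, power) pairs
    sorted by descending explanatory power.
    """
    by_dimension = {}
    for elem in elements:
        by_dimension.setdefault(elem.dimension, []).append(elem)

    kept = []
    for elems in by_dimension.values():
        total_forecast = sum(e.forecast for e in elems)
        total_actual = sum(e.actual for e in elems)
        for elem in elems:
            power = explanatory_power(elem, total_forecast, total_actual)
            if power >= threshold:
                kept.append((elem, power))

    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


# Made-up sample data: one province accounts for the whole drop.
sample = [
    Element("province", "Beijing", forecast=100.0, actual=40.0),
    Element("province", "Shanghai", forecast=100.0, actual=98.0),
    Element("province", "Guangdong", forecast=100.0, actual=102.0),
]
survivors = prune_candidates(sample)
```

With this made-up data, only the element that accounts for most of the drop survives pruning, so a later search would explore far fewer element combinations.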
Keywords/Search Tags:multi-dimensional indicator, fault detection, variational autoencoder, root cause location