
Design And Implementation Of AIOps System Based On Spark

Posted on: 2021-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Shi
Full Text: PDF
GTID: 2518306020958069
Subject: Control Engineering
Abstract/Summary:
Operation and maintenance (O&M) is key to keeping an enterprise running normally. For Internet enterprises in particular, it is an important part of the services they provide and the work they do. In essence, O&M covers the networks, servers, and services at every stage of their life cycle. AIOps stands for Artificial Intelligence for IT Operations. Building intelligent O&M scenarios on the large volume of data that O&M generates, combined with big data, machine learning, deep learning, and related technologies, is the development trend of the O&M industry. Generally speaking, AIOps makes the formulation of O&M rules intelligent: the traditional process, in which experts summarize domain knowledge into rules, is upgraded to one in which rules are learned and generated automatically by machine. Specifically, AIOps draws on the O&M and management knowledge accumulated over a long period in a real production environment and automatically learns how its rules are formed, so that the system is "de-regularized" and human involvement is reduced as far as possible. The aim is the best balance of quality, cost, and efficiency, and the maximum profit for the enterprise.

With the rapid development of information technology across industries, enterprise business is growing in scale and complexity, and so are IT systems and the data they must process. In current enterprise monitoring deployments, a server must monitor a large number of KPIs with different characteristics. If technicians set a separate alarm threshold for each KPI based on their experience, errors are likely. Moreover, the characteristics of many KPIs change as the business changes. A single alarm threshold configuration struggles to cope with this and produces many false alarms or missed alarms. The research goal of this thesis is to design an intelligent O&M system that helps O&M personnel solve these problems.

This thesis studies the application of traditional time series analysis, machine learning, and big data technology in AIOps. Based on the time series data generated in the enterprise's actual O&M work, it sorts out the data preprocessing methods used for time series trend prediction and summarizes methods that effectively extract features from time series data. Using time series data from the enterprise's production environment, and building on the mature big data computing framework Spark, the message queue Kafka, the key-value database Redis, and the distributed database Druid, together with traditional time series analysis and machine learning techniques, it builds an AIOps system that provides baseline detection for basic data, real-time prediction for business data, and anomaly detection for industry data.
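The contrast between a single fixed alarm threshold and a baseline learned from recent data can be illustrated with a minimal Python sketch. This is not the method used in the thesis, whose algorithms are not described in this abstract. The function names, the sample KPI values, the window length, the multiplier `k`, and the `min_sigma` floor are all assumptions chosen for illustration. The sketch flags a point when it falls outside the mean plus or minus `k` standard deviations of the preceding window.

```python
from collections import deque
from statistics import mean, stdev


def static_alarms(values, threshold):
    """Return indices whose value exceeds one fixed, hand-set threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def baseline_alarms(values, window=5, k=3.0, min_sigma=1.0):
    """Return indices that leave mean +/- k*std of the previous `window` points.

    `window`, `k`, and `min_sigma` are illustrative defaults, not tuned values.
    """
    history = deque(maxlen=window)  # sliding window of the most recent points
    alarms = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            # Floor the spread so a perfectly flat window does not flag tiny changes.
            sigma = max(stdev(history), min_sigma)
            if abs(v - mu) > k * sigma:
                alarms.append(i)
        # Every point, flagged or not, enters the window, so the baseline
        # follows a lasting level shift instead of alarming forever.
        history.append(v)
    return alarms


# Hypothetical KPI: stable near 100, one spike, then a lasting shift to about 200
# (for example, after a business change).
kpi = [100, 102, 98, 101, 99, 100, 160, 101, 99, 100,
       200, 202, 198, 201, 199, 200, 201]

print(static_alarms(kpi, threshold=150))  # [6, 10, 11, 12, 13, 14, 15, 16]
print(baseline_alarms(kpi))               # [6, 10]
```

On this sample, the fixed threshold keeps alarming on every point after the level shift, while the rolling baseline reports the spike and the moment of the shift and then adapts. A production system would also need to handle seasonality and missing data, which this sketch ignores.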
Keywords/Search Tags: AIOps, Machine Learning, Time Series, Anomaly Detection