
Design And Implementation Of Operation And Maintenance Data Anomaly Detection System For Private Cloud Services

Posted on: 2023-01-20  Degree: Master  Type: Thesis
Country: China  Candidate: Z C Wang  Full Text: PDF
GTID: 2568307298979189  Subject: Computer technology
Abstract/Summary:
Cloud services are now widely used across industries. However, both private and public clouds share the same problem, known as service fragmentation. Cloud services such as Infrastructure-as-a-Service, Platform-as-a-Service, and Software-as-a-Service are independent of each other, and the clusters they run on are extremely large. When a cloud service fails, O&M (Operation and Maintenance) staff can only inspect and repair the problem within that service; they cannot troubleshoot it across multiple dimensions spanning several cloud services. Public clouds carry the uncertainty of multi-user services and have far higher architectural complexity than private clouds, so this thesis addresses anomaly detection for O&M data in a lightweight private cloud.

To eliminate a fault shortly after it occurs, or even to discover a problem before it occurs, it is necessary to use the logs of the cloud services and systems. A platform is therefore needed to maintain the set of cloud services and virtual machines and, using the corresponding log location information, to collect the logs and transmit them to a database. The anomaly detection system then preprocesses the uniformly stored logs, performs intelligent analysis, and raises an alert when an abnormality is found. Based on these objectives, this thesis designs and implements an O&M data anomaly detection system that combines AIOps (Artificial Intelligence for IT Operations) with a CMDB (Configuration Management Database). The specific methods and results are as follows.

a) CMDB part: The various systems in the Qing Cloud Private Cloud are surveyed. Based on the company's existing architecture, the services and log locations that need to be monitored are identified and stored in the "Blue King CMDB". The work focuses on the representative services in the Private Cloud and their log collection solutions. The node management module of the "Blue King CMDB" is studied, and its API interfaces are used to retrieve the log locations, services, cluster structure, and other information stored in the CMDB. The log channel is studied end to end, from collecting and storing log files to transmitting them to the Elasticsearch database, together with a method for preserving log order when reading from Elasticsearch.

b) AIOps part: The Log2Vec and DeepLog algorithms are studied, with a focus on the Neon SAN log set in the Qing Cloud Private Cloud. A Neon SAN log model is obtained by combining log preprocessing with deep learning. With top-3 prediction (three candidate results for the next log line), the model achieves an F1 score (F1-measure) of 96.96% on the training set and 92.8% on the validation set. On public data sets, this solution also shows a stable advantage over other algorithms such as PCA, Invariants Miner, Log Clustering, AE, and Transformer: the F1 score is 90% on the HDFS (Hadoop Distributed File System) set and 95% on the BG/L (Blue Gene/L) set.

c) System part: Offline log training is separated from the O&M data anomaly detection system. The two are designed as separate submissions, so that the offline training system and the O&M data anomaly detection system are decoupled from each other. The system performs online log detection, alerting, and root cause analysis. It relies on the "Blue King CMDB" and AIOps to achieve closed-loop management of O&M data, covering services, logs, detections, alerts, and feedback. Finally, it provides early warning of possible system problems in the Private Cloud.
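The top-3 criterion and the F1 score reported in part b) can be illustrated with a minimal sketch. This is not the thesis implementation: the deep learning predictor is replaced here by a simple frequency count of which log key follows each window of preceding keys, and the function names, the integer log keys, and the `window` parameter are all assumptions made for illustration. Only the decision rule (a log line is normal if its key is among the top-k predicted next keys) and the F1 formula follow the description in the abstract.

```python
from collections import Counter, defaultdict


def train_next_key_counts(sequences, window=2):
    """Count which log key follows each window of preceding keys.

    Stand-in for a learned next-key predictor (assumption, not the
    thesis model): predictions are ranked by observed frequency.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(window, len(seq)):
            counts[tuple(seq[i - window:i])][seq[i]] += 1
    return counts


def is_anomalous(seq, counts, window=2, top_k=3):
    """Flag a sequence if any key is outside the top-k predicted next keys."""
    for i in range(window, len(seq)):
        history = tuple(seq[i - window:i])
        ranked = counts.get(history, Counter()).most_common(top_k)
        candidates = [key for key, _ in ranked]
        if seq[i] not in candidates:
            return True  # unseen history or unexpected next key
    return False


def f1_score(y_true, y_pred):
    """F1 = 2PR / (P + R), with 1 meaning anomalous and 0 meaning normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical training data: integer log keys from normal runs.
normal_runs = [[1, 2, 3]] * 3 + [[1, 2, 4]] * 2 + [[1, 2, 5]]
model = train_next_key_counts(normal_runs)
```

A larger `top_k` tolerates more variation in normal logs and so lowers false alarms, at the cost of possibly missing anomalies whose keys happen to rank among the candidates.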
Keywords/Search Tags:anomaly detection, configuration management database, Qing Cloud private cloud, log parser