Since the beginning of the information age, the rapid innovation and popularization of network technology has caused the number of network users to grow explosively, and users now demand log systems that are low-cost, high-performance, and highly secure. Many Internet applications adopt hierarchical, distributed, clustered, and cached architectures. These large-scale architectures and the deployment of large numbers of servers make system operation and maintenance and anomaly detection increasingly challenging, and many traditional anomaly detection methods based on data mining are no longer applicable. Therefore, a quasi-real-time log system that can collect, analyze, and visualize logs and give timely early warning of system anomalies is of great significance for ensuring the security and reliability of system operation while reducing the cost of log maintenance.

This paper uses the logs generated by a commercial server cluster as the validation data set and, in line with an Internet company's log collection and analysis requirements, designs and implements a massive log collection and analysis system based on the three open-source components Elasticsearch, Logstash, and Kibana (ELK), realizing log collection, anomaly detection, and visualization. The work of this paper mainly includes the following aspects:

(1) Research on massive log collection, storage, and indexing technology, and design and implementation of a quasi-real-time log analysis system based on ELK. The characteristics of various logs and methods for analyzing them are studied; Logstash is introduced to collect cluster logs non-invasively, and Elasticsearch is used to store and index the logs, so that the system can be scaled horizontally with ease and can analyze logs efficiently at various scales (a minimal sketch of this kind of pipeline is given below).

(2) Research on anomaly detection methods proposed in recent years, and design and implementation of an anomaly detection model based on Recurrent Neural Networks (RNN). The model learns log patterns from the existing log data set and incrementally updates its parameters in an online manner, so that it can adapt to new log patterns over time (a model sketch is given below).

(3) Research on log visualization technology, and design and implementation of dynamic, comprehensive dashboards over multiple data sources, improving the enterprise's log processing capability and its ability to resolve faults. The collected logs can be retrieved according to specified query conditions, and the results can be presented as tables, charts, and other visualizations (an example query is given below).
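As a rough illustration of the idea behind aspect (1), the sketch below parses a raw cluster log line into structured fields and indexes it into Elasticsearch. It is not the paper's actual Logstash pipeline; the index name "cluster-logs", the field names, the log format, the node address, and the use of the official elasticsearch Python client (8.x) are all illustrative assumptions.

```python
import re
from datetime import datetime, timezone

from elasticsearch import Elasticsearch

# Assumed log format: "<date> <time> <host> <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+ \S+) (?P<host>\S+) (?P<level>[A-Z]+) (?P<message>.*)"
)

es = Elasticsearch("http://localhost:9200")  # assumed local Elasticsearch node


def index_log_line(line: str) -> None:
    """Parse one raw log line and store it as a structured document."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return  # skip lines that do not match the assumed format
    doc = match.groupdict()
    doc["ingested_at"] = datetime.now(timezone.utc).isoformat()
    es.index(index="cluster-logs", document=doc)


index_log_line("2023-05-01 12:00:01 node-3 ERROR disk usage above threshold")
```

In a deployment following the paper's design, Logstash would perform this parsing and shipping non-invasively on each node; the Python version above only shows the structure of the resulting documents.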
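For aspect (2), the sketch below shows one common way an RNN can be used for log anomaly detection: an LSTM predicts the next log key (template ID) from a window of previous keys, and an entry whose observed key falls outside the top-k predictions is flagged as anomalous; a small online update step adapts the model to newly confirmed-normal patterns. The hyperparameters, vocabulary size, and update loop are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

NUM_KEYS = 50   # assumed size of the log-template vocabulary
WINDOW = 10     # assumed history window length
TOP_K = 5       # an entry is "normal" if its key is among the top-k predictions


class LogKeyLSTM(nn.Module):
    def __init__(self, num_keys: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_keys, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_keys)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(self.embed(x))   # (batch, window, hidden)
        return self.head(out[:, -1, :])     # logits for the next log key


model = LogKeyLSTM(NUM_KEYS)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()


def is_anomalous(window: torch.Tensor, next_key: int) -> bool:
    """Flag the entry if the observed key is not among the top-k predicted keys."""
    with torch.no_grad():
        logits = model(window.unsqueeze(0))
    top_keys = logits.topk(TOP_K, dim=-1).indices.squeeze(0)
    return next_key not in top_keys.tolist()


def online_update(window: torch.Tensor, next_key: int) -> None:
    """Incrementally update parameters on a newly observed normal pattern."""
    optimizer.zero_grad()
    loss = loss_fn(model(window.unsqueeze(0)), torch.tensor([next_key]))
    loss.backward()
    optimizer.step()
```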
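For aspect (3), dashboards of the kind described are built in Kibana, but the data behind a typical panel is a conditional query with an aggregation against Elasticsearch. The sketch below counts ERROR-level entries per host over the last hour, the kind of result that could back a bar chart or table; the index name, field names, keyword mappings, and time range are illustrative assumptions, and the elasticsearch Python client (8.x) is assumed.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local Elasticsearch node

# Count ERROR-level log entries per host over the last hour.
response = es.search(
    index="cluster-logs",
    size=0,  # aggregation results only, no individual documents
    query={
        "bool": {
            "filter": [
                {"term": {"level": "ERROR"}},
                {"range": {"ingested_at": {"gte": "now-1h"}}},
            ]
        }
    },
    aggs={"errors_per_host": {"terms": {"field": "host", "size": 10}}},
)

for bucket in response["aggregations"]["errors_per_host"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```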