The Design And Implementation Of Log Analysis System For Distributed Application Software

Posted on: 2019-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
GTID: 2428330590460055
Subject: Engineering
Abstract/Summary:
With the rapid growth of Internet data and the increasing diversity of service types, enterprise services are gradually moving from stand-alone software systems to distributed software systems. At the same time, the complexity of system maintenance increases exponentially. Large numbers of scattered logs are hard to manage, manual retrieval of key abnormal information from the logs is inefficient, useful information may be missed, and much valuable log information is not fully utilized. This thesis designs and implements a log analysis system for distributed application software to address these operational difficulties in distributed application scenarios. The log analysis system is divided into two modules: log aggregation and log analysis. The log aggregation module consists of three parts: log collection, a message queue, and distributed storage. Log collection uses the open-source component Flume, with a custom channel, DoubChannel, developed for the real application scenario. DoubChannel lets data switch freely between the memory channel and the file channel. A Kafka message queue buffers data between log collection and storage to avoid performance problems. Distributed log storage uses the distributed search engine Elasticsearch, which supplies data to the log analysis module. The log analysis module consists of four parts: online task management, log clustering analysis, correlation analysis, and anomaly scenario analysis. Online task management is responsible for starting and stopping the whole module. Log clustering analysis reads logs from Kafka and generates a log template library using IPLoM and DBSCAN. Correlation analysis first retrieves the logs and log templates from Elasticsearch and the database to generate a log distribution baseline. It then analyzes real-time log data with the box-splitting algorithm, the quantile algorithm, and the KSigma model, according to the template library and the distribution baseline, to obtain real-time window data. Anomaly scenario analysis identifies anomaly markers and identifies faults in successive anomaly windows using the quantile algorithm and the LCS algorithm. The visualization part of the system clearly displays the log analysis results and detailed information. Functional and performance testing show that the system can quickly detect anomalies, identify faults, and give fault causes and emergency plans. The system can make operation and maintenance more convenient for enterprises.
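The abstract does not give the details of the KSigma model or the quantile algorithm, so the following is only a minimal sketch of how a per-window log count might be checked against a baseline. The function names, the default `k=3.0`, the 1st/99th percentile cut-offs, and the rule of flagging a window when either check fails are all assumptions made for illustration, not the thesis's implementation.

```python
from statistics import mean, pstdev, quantiles


def ksigma_bounds(baseline, k=3.0):
    """Return (low, high) = mean -/+ k population standard deviations."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    return mu - k * sigma, mu + k * sigma


def quantile_bounds(baseline, lower_pct=1, upper_pct=99):
    """Return the lower and upper percentile values of the baseline."""
    # quantiles(..., n=100) yields 99 cut points; index i-1 is the i-th percentile.
    cuts = quantiles(baseline, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def is_anomalous(count, baseline, k=3.0):
    """Flag a window whose count falls outside either set of bounds.

    Combining the two checks with "or" is an illustrative choice only.
    """
    k_low, k_high = ksigma_bounds(baseline, k)
    q_low, q_high = quantile_bounds(baseline)
    outside_ksigma = not (k_low <= count <= k_high)
    outside_quantile = not (q_low <= count <= q_high)
    return outside_ksigma or outside_quantile


# Hypothetical per-window counts of one log template during normal operation.
baseline_counts = [10, 12, 11, 9, 10, 13, 11, 10, 12, 11]
print(is_anomalous(50, baseline_counts))
print(is_anomalous(11, baseline_counts))
```

In a pipeline like the one described, a baseline of this kind would presumably be built per log template from historical data, and each real-time window would be compared against it.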
Keywords/Search Tags: Log system, Cluster analysis, Correlation analysis, Anomaly identification