Research And Implementation Of Log Management Module In Cloud Platform

Posted on: 2018-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Pu
Full Text: PDF
GTID: 2348330512984914
Subject: Engineering
Abstract/Summary:
With the development of cloud computing, existing cloud platforms have become structurally complex. The log data generated by each node and component module in a platform is huge and scattered across different places, and each log source independently produces log data in its own format. As a result, the utilization rate of log data is low and little valuable information can be extracted from it, which makes developing and maintaining a cloud platform difficult. In response to this situation and demand, this thesis studies the management and analysis of log data in a cloud platform, and designs and implements a log management system suited to the cloud platform. The system provides log collection, log analysis, and display of the final results. Accordingly, it is composed of a log aggregation module, a message queue module, a log analysis module, and a result display module. Each module has been designed and implemented in detail; the log aggregation module and the log analysis module are the two most important.

Considering the complex architecture of the cloud platform, a distributed log collection module is proposed on the basis of Flume to meet the requirements of high availability, high reliability, scalability, and real-time processing. This thesis addresses some shortcomings of Flume through functional improvements and performance optimization. Functional improvements: (1) the HdfsSink creates an index automatically, so that files written to HDFS can be segmented according to the index, which facilitates parallel data processing; (2) the proper channel is chosen automatically, so that log data in the service is written to different channels according to the current situation, that is, the system switches between the memory channel and the file channel automatically, which improves throughput and stability. Performance optimization: (1) the basic parameters of Flume are adjusted according to actual needs; (2) the HdfsSink is divided into several parts to improve the efficiency of data writing.

For the log analysis module, this thesis proposes two methods of log analysis: a distributed call tracing method and a fault log association analysis method based on a time sliding window. The distributed call tracing method provides a way to build a call graph in a distributed system, to reconstruct the complete path of a particular request, to calculate the elapsed time of a particular request and of each module in the call graph, and to debug the distributed system. Fault log association analysis builds on an existing classical association analysis algorithm, combined with the concept of a time sliding window, and is used to analyze the correlation between fault logs. In the design and implementation, changes were made to avoid two typical errors, conflict and truncation. The final output of the fault log association analysis is a fault propagation tree, which can be used to predict faults and locate them.

Finally, the system has been tested in many respects. Experiments show that the log management system designed in this thesis can adapt to distributed systems such as a cloud platform, and that it achieves high throughput, high reliability, low latency, and high efficiency. The system can also predict faults and locate the source fault, which provides valuable information to the system administrator.
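
The time-sliding-window idea behind the fault log association analysis can be illustrated with a minimal sketch. This is not the thesis's algorithm, which is not given in the abstract and which additionally handles the conflict and truncation errors. It is a simple pairwise co-occurrence count, assuming fault logs have already been reduced to `(timestamp, fault_type)` pairs. The function name `fault_rules`, the fault names, the window length, and the confidence threshold are all hypothetical.

```python
from collections import Counter


def fault_rules(events, window, min_confidence):
    """Derive rules "fault a is followed by fault b within `window` seconds".

    events: iterable of (timestamp_seconds, fault_type) pairs (assumed format).
    Returns {(a, b): confidence}, where confidence is the fraction of
    occurrences of a that were followed by at least one b inside the window.
    """
    events = sorted(events)
    occurrences = Counter(fault for _, fault in events)
    followed_by = Counter()

    for i, (t_i, a) in enumerate(events):
        seen = set()  # count each follower type once per anchor event
        for t_j, b in events[i + 1:]:
            if t_j - t_i > window:
                break  # events are sorted, so later ones are outside the window
            if b != a and b not in seen:
                seen.add(b)
                followed_by[(a, b)] += 1

    rules = {}
    for (a, b), count in followed_by.items():
        confidence = count / occurrences[a]
        if confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


# Hypothetical fault log, timestamps in seconds.
sample_events = [
    (0, "disk_full"), (2, "db_write_error"), (3, "api_500"),
    (100, "disk_full"), (101, "db_write_error"), (300, "api_500"),
]
rules = fault_rules(sample_events, window=5, min_confidence=0.6)
```

Here `rules` keeps only the pair `("disk_full", "db_write_error")`, because every `disk_full` event is followed by a `db_write_error` within five seconds. Chaining such directed pairs from a root fault is one plausible way to arrive at a propagation tree like the one the abstract describes.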
Keywords/Search Tags: cloud computing, log system, fault correlation analysis, distributed call tracing