
Design And Development Of Lightweight And High-performance Log Collector

Posted on: 2022-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: M H Zhou
Full Text: PDF
GTID: 2518306773497814
Subject: Library Science and Digital Library
Abstract/Summary:
With the rapid development of distributed systems and the continuous horizontal scaling of complex applications, logs are scattered across many different machines, and the volume of logs generated by running applications grows exponentially. This brings new challenges to log collection, storage, and analysis. The log collection schemes commonly available today cannot simultaneously address poor analyzability, poor performance, unreliability, and poor extensibility; in particular, no breakthrough has been made on performance, which cannot keep up with rapid business growth. This thesis therefore focuses on the performance problem while also ensuring reliability, analyzability, and scalability, and provides a better log collection solution for such scenarios.

For analyzability, log data is structured through JSON serialization to facilitate later analysis, and a log sorting field is added to preserve the order of logs generated within the same millisecond, which simplifies troubleshooting. For scalability, the data stored in MDC is captured automatically when a log record is assembled, so each business line can put its own custom attributes into MDC and thereby extend the log content; Kafka and Pulsar, as distributed and horizontally scalable components, solve the scalability problem of the log center service.

For reliability, logs are temporarily stored on the local disk when the log center is unavailable; a heartbeat mechanism avoids log loss caused by asynchronous network transmission; and a send-confirmation mechanism avoids losing in-memory logs when the service restarts, provided those in-memory logs were read from disk. To support local storage, this thesis implements a log-persistence file system that provides temporary storage and reading of logs, supports different compression algorithms, guarantees that logs are read in order, allows the size of a single log file and the number of log files to be configured, and can locate the index positions of the last read and write by itself after a service restart.

For performance, batching is introduced in network transmission to save network I/O; the Disruptor high-performance queue replaces the blocking queues that ship with the JDK; the JSON serialization path is rewritten to reduce intermediate steps and temporary object creation; memory reuse reduces the overhead of allocating and releasing memory; and zero-copy techniques reduce the overhead of memory copying. The custom JSON serialization tool implemented in this thesis achieves zero copy and zero GC, and supports timestamp formatting, truncation of overly long strings, and reuse of byte arrays. The rewritten timestamp formatting is seven times faster than Java's date formatting tools.

Testing shows that system performance improves significantly: throughput is 18 times that of Filebeat's file-based collection and 5 times that of an open-source network collection method from GitHub, and on an ordinary 4-core, 8 GB machine the collector can sustain a peak log output rate of up to one million entries per second. The log collector designed and implemented in this thesis has been put into production in the company's application systems with good results.
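To make the analyzability and extensibility points concrete, the following minimal Java sketch assembles a structured log record that carries a sorting sequence number and the MDC attributes contributed by each business line. The LogRecordAssembler class and the field names are illustrative assumptions, not the thesis's actual schema or code.

```java
import org.slf4j.MDC;

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative assembly of a structured log record; names are assumptions.
public final class LogRecordAssembler {
    // Monotonic counter used as the sorting field, so logs produced within
    // the same millisecond keep a stable order.
    private static final AtomicLong SEQUENCE = new AtomicLong();

    public static Map<String, Object> assemble(String level, String message) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("timestamp", System.currentTimeMillis());
        record.put("seq", SEQUENCE.getAndIncrement()); // sorting field
        record.put("level", level);
        record.put("message", message);

        // Business lines extend the log content with MDC.put("key", "value");
        // the copy is null when nothing has been put into MDC on this thread.
        Map<String, String> mdc = MDC.getCopyOfContextMap();
        if (mdc != null) {
            record.putAll(mdc);
        }
        return record; // the collector would serialize this map to JSON
    }
}
```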
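The next sketch shows how the LMAX Disruptor can stand in for a JDK blocking queue as the collector's in-memory queue. The LogEvent class, buffer size, and consumer logic are assumptions for illustration, not the thesis's implementation.

```java
import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

public final class LogQueueSketch {
    // Mutable event reused by the ring buffer, so no garbage per log line.
    public static class LogEvent {
        private String line;
        public void set(String line) { this.line = line; }
        public String get() { return line; }
    }

    public static void main(String[] args) {
        int bufferSize = 1 << 16; // ring buffer size must be a power of two
        Disruptor<LogEvent> disruptor = new Disruptor<>(
                LogEvent::new, bufferSize, DaemonThreadFactory.INSTANCE);

        // Consumer: here it just prints; a real collector would batch the
        // events and send them to Kafka or Pulsar.
        disruptor.handleEventsWith((event, sequence, endOfBatch) ->
                System.out.println(event.get()));
        disruptor.start();

        RingBuffer<LogEvent> ringBuffer = disruptor.getRingBuffer();
        // Producer side: copy the message into the pre-allocated slot.
        ringBuffer.publishEvent((event, seq, msg) -> event.set(msg), "hello log");
    }
}
```

Because producers write into pre-allocated event slots instead of enqueueing freshly created objects, this style avoids both the locking of JDK blocking queues and per-message allocation, which is the motivation stated in the abstract.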
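As one plausible way to obtain a large speed-up over Java's standard date formatting, the sketch below caches the formatted second and patches only the millisecond digits into a reusable buffer on each call; the thesis's actual formatter may work differently, and the UTC zone and pattern are assumptions.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative low-allocation timestamp formatter (not the thesis's code).
// Not thread-safe: each producer thread would hold its own instance.
public final class FastTimestampFormatter {
    private static final DateTimeFormatter SECOND_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private long cachedSecond = Long.MIN_VALUE;
    // 23 chars: "yyyy-MM-dd HH:mm:ss.SSS"
    private final char[] buffer = new char[23];

    public char[] format(long epochMillis) {
        long second = epochMillis / 1000;
        if (second != cachedSecond) {
            // Re-render the prefix only when the second changes.
            String prefix = Instant.ofEpochSecond(second)
                    .atOffset(ZoneOffset.UTC) // assumption: logs use UTC
                    .format(SECOND_FORMAT);
            prefix.getChars(0, 19, buffer, 0);
            buffer[19] = '.';
            cachedSecond = second;
        }
        int millis = (int) (epochMillis % 1000);
        buffer[20] = (char) ('0' + millis / 100);
        buffer[21] = (char) ('0' + (millis / 10) % 10);
        buffer[22] = (char) ('0' + millis % 10);
        return buffer; // caller copies the chars into the outgoing byte array
    }
}
```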
Keywords/Search Tags: Log Collection, High Concurrency Scenario, High-performance Memory Queue, Log Serialization, Garbage Collection (GC) Problem