Design And Implementation Of Log Collector Component

Posted on: 2020-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z T Chen
GTID: 2428330590450622
Subject: Software engineering
Abstract/Summary:
In the era of big data, log data is an important part of massive data, and its importance has received increasing attention. As an essential output channel, a log records the behavior trajectory of a system and is one of the important data sources for system troubleshooting, anomaly tracking, system monitoring, intrusion detection, application traffic assessment, application performance optimization, and data mining. The emergence of distributed systems, the horizontal and vertical scaling of complex systems, and the continuing development of technologies such as system clustering and containerized deployment have led to massive growth in logs, and have brought many challenges to their collection, storage, and analysis. The contradiction between diverse, multi-environment collection requirements and efficient collection constrains the development of log collection. Because Apache Flume and Filebeat each have shortcomings in log collection, it is critical to design and implement a lightweight, versatile, and efficient log collection solution.

The core of a log collection system is the distributed log collector component, which is deployed on every machine of the cluster nodes to be collected. The focus of this paper is the design of LogCollector, a lightweight log collector component with real-time performance, dynamic configuration, high reliability, and monitorability, to solve the problem of massive log collection on cluster nodes. Through an investigation of Docker containers, the storage principle of log files on physical machines, and event-trigger mechanisms such as the dynamic generation and updating of log files, the paper clarifies how log files are generated and written. The objects of collection are mainly the log files of the physical machines and container machines in the cluster. Log collection requirements are abstracted as a log model and collection actions as log events; multi-threading is used for acquisition, a read-write synchronization lock is used to store collection offsets, and the Linux kernel inotify facility is used to dynamically monitor the collected files. Logs are sent in real time to the configured Kafka cluster, a distributed message queue, using the Producer mode that Kafka provides.

Based on the purpose and significance of LogCollector, the paper analyzes the requirements according to the actual research situation, designs the overall architecture and functional modules, focuses on the implementation of each functional module, evaluates the functional and performance test results of LogCollector, and finally summarizes the research results.
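The offset bookkeeping described above can be illustrated with a minimal sketch. The class and method names below are hypothetical, not taken from the thesis; the sketch assumes a map from file path to read offset guarded by a ReentrantReadWriteLock, so that concurrent collector threads can query offsets without blocking each other while checkpoint writes remain exclusive.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    /**
     * Stores the read offset of each collected log file. Collector threads
     * read offsets concurrently; checkpoint writes take the exclusive lock.
     */
    public class OffsetStore {
        private final Map<String, Long> offsets = new HashMap<>();
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        /** Returns the last recorded offset for a file, or 0 if unseen. */
        public long get(String filePath) {
            lock.readLock().lock();
            try {
                return offsets.getOrDefault(filePath, 0L);
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Records the new offset after a chunk of the file has been read. */
        public void update(String filePath, long offset) {
            lock.writeLock().lock();
            try {
                offsets.put(filePath, offset);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }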
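For the dynamic monitoring of collected files, the thesis uses the Linux kernel inotify facility directly. As a rough stand-in, the sketch below uses Java's standard WatchService, which is implemented on top of inotify on Linux; the watched directory path is a hypothetical placeholder.

    import java.nio.file.*;
    import static java.nio.file.StandardWatchEventKinds.*;

    /** Watches a log directory for newly created or modified files. */
    public class LogDirWatcher {
        public static void main(String[] args) throws Exception {
            Path logDir = Paths.get("/var/log/app");   // hypothetical path
            WatchService watcher = FileSystems.getDefault().newWatchService();
            logDir.register(watcher, ENTRY_CREATE, ENTRY_MODIFY);

            while (true) {
                WatchKey key = watcher.take();         // blocks until events arrive
                for (WatchEvent<?> event : key.pollEvents()) {
                    Path changed = logDir.resolve((Path) event.context());
                    System.out.println(event.kind() + ": " + changed);
                    // a real collector would (re)open the file here and
                    // read from its stored offset
                }
                key.reset();                           // re-arm the watch key
            }
        }
    }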
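Sending in Producer mode can be sketched with Kafka's standard Java client. The broker address, topic name, and sample record below are placeholders, not configuration taken from the thesis.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    /** Sends collected log lines to a Kafka topic in real time. */
    public class LogSender {
        public static void main(String[] args) {
            Properties props = new Properties();
            // broker address and serializers; "kafka1:9092" is a placeholder
            props.put("bootstrap.servers", "kafka1:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String logLine = "2020-04-26 12:00:00 INFO sample log line";
                // key = source file path, value = raw log line
                producer.send(new ProducerRecord<>("app-logs",
                        "/var/log/app/app.log", logLine));
                producer.flush();
            }
        }
    }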
Keywords/Search Tags:Big data, Distribution, Log collection, Container file collection, Log collector