| With the rapid development of the Internet industry and the expansion of production scale, websites and management systems that run without interruption generate large volumes of data. These data reflect production patterns and problems and are a valuable reference for managers. Processing the logs of transactions and production lines has therefore become a major requirement of modern enterprise management and production statistics. By capturing and analyzing log file data, it is possible to monitor the status of products on a large-scale production line and to obtain production trends and related statistics. For production line managers, analyzing massive log files to obtain product statistics and monitor production status has considerable practical and reference value. This thesis presents a production line log processing system that captures data from a mass production line and processes it automatically, covering the extraction of data from massive production line log files, product statistics, trend analysis, and data caching. The thesis is organized around the following aspects: (1) Overview of data capture technology. This part introduces the HTTP protocol and related technologies, the multi-threaded data processing model, hash-based grouping of data, Memcache data caching, and their typical applications in data processing and data mining. (2) Overview of the log analysis process. This part describes the log analysis process in detail, including the system startup process, the log analysis process, and the statistics output process. Each process involves several modules and systems, which interact with one another and pass data between them. The description covers the full path of the log files, from collection on the production line through data extraction and cached output to permanent storage. (3) Design and implementation of the log processing system. 
Starting from the requirements for log analysis, the system is divided into functions and modules using a function chart, followed by a detailed analysis and discussion of the design of the interfaces between modules, the modules themselves, and the associated algorithms. Functional and performance tests of the log processing system are then reported to show how the system operates. By applying log data processing technology to large-scale production data and integrating existing data processing methods, the system removes redundant and invalid log data and, in line with the enterprise's business needs, predicts future product trends while summarizing and correcting errors in past production. The overall process is greatly simplified, the need for manual involvement is eliminated, and enterprise management costs are reduced. |
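
The combination of hash-based grouping and multi-threaded processing mentioned above can be illustrated with a minimal Python sketch. It is not the system described in the thesis. The log format (`timestamp line_id status`), the function names, and the bucket count are all assumptions made for illustration only.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def bucket_of(line_id: str, n_buckets: int) -> int:
    """Map a production line id to a bucket (hypothetical helper).

    crc32 is used because it is stable across runs, unlike the
    built-in hash() for strings.
    """
    return zlib.crc32(line_id.encode("utf-8")) % n_buckets


def partition(lines, n_buckets):
    """Group log records by the hash of their line id, dropping invalid ones."""
    buckets = [[] for _ in range(n_buckets)]
    for line in lines:
        parts = line.split()
        if len(parts) != 3:  # assumed format: timestamp line_id status
            continue         # discard malformed (invalid) records
        buckets[bucket_of(parts[1], n_buckets)].append(parts)
    return buckets


def count_bucket(records):
    """Count records per (line_id, status) within one bucket."""
    counts = Counter()
    for _timestamp, line_id, status in records:
        counts[(line_id, status)] += 1
    return counts


def analyze(lines, n_buckets=4):
    """Process each bucket in its own worker thread and merge the results."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_buckets) as pool:
        for partial in pool.map(count_bucket, partition(lines, n_buckets)):
            total.update(partial)
    return total
```

Because all records for one line id land in the same bucket, each worker can aggregate without coordinating with the others. The merged counter is the kind of statistic that a cache layer such as Memcache could then hold for output.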