The Design And Implementation Of Mass Data Network Log Analysis System

Posted on: 2017-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Wang
Full Text: PDF
GTID: 2308330509957564
Subject: Software engineering
Abstract/Summary:
With the rapid development of distributed and network technologies, service systems produce large volumes of log data every day, and as data volumes keep growing, analysis techniques for massive data have emerged. In CDN systems, nodes are widely distributed geographically and network latency is high, so traditional data analysis methods can no longer meet demand. This shift has given rise to massive network traffic log analysis systems built on a distributed node architecture.
This thesis analyzes the limitations of traditional log analysis systems in a CDN setting, then designs and implements a massive network traffic log analysis system with a distributed architecture. The system addresses, to some extent, the problems caused by surging data volumes and poor network conditions, which gives the study practical significance.
The logs analyzed by the system come from the 360 cloud CDN. The system is divided into two parts.
Real-time statistical analysis part: this part consists of five modules: the log file monitoring module, the business analysis module, the data transmission module, the central aggregation module, and the persistent storage module. The log file monitoring module watches the log file on a single server and passes newly generated log content to the business analysis module. The business analysis module performs the actual statistical and analytical work. The data transmission module sends each node's results to a central aggregation server, where the central aggregation module combines them. The persistent storage module then writes the combined results to a central database server, where they wait to be queried by the front-end interface for display.
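The real-time flow described above (monitor a node's log file, compute per-node statistics, combine them centrally) can be illustrated with a minimal sketch. It is not the thesis's implementation: the log format `<timestamp> <channel> <bytes>`, the choice of "bytes served per channel" as the statistic, and the function names `read_new_lines`, `analyze_lines`, and `aggregate` are all assumptions made for illustration. Data transmission and persistent storage are left out.

```python
from collections import Counter


def read_new_lines(path, offset):
    """Log file monitoring (hypothetical): return complete lines added since `offset`.

    Reads in binary mode so that `offset` is a reliable byte position.
    A partial trailing line is left unread until its newline arrives.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete line is available yet
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def analyze_lines(lines):
    """Business analysis on one node (hypothetical): bytes served per channel.

    Assumes each line looks like "<timestamp> <channel> <bytes>".
    Malformed lines are skipped.
    """
    stats = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        _, channel, nbytes = parts
        try:
            size = int(nbytes)
        except ValueError:
            continue
        stats[channel] += size
    return stats


def aggregate(partials):
    """Central aggregation (hypothetical): sum the per-node counters."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts instead of replacing
    return total
```

Computing a small summary on each node and shipping only that summary keeps the traffic sent to the central server low, which fits the high-latency, widely distributed setting the abstract describes.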
Offline split-and-merge part: this part consists of four modules: the log reading module, the block cutting module, the log synchronization module, and the central summary sorting module. The log reading module reads the log file into memory in blocks, splits it into lines, and passes them to the block cutting module. The block cutting module classifies the log by channel, cuts it into separate block files, and builds a corresponding index file for each. The log synchronization module periodically synchronizes the completed block files and index files to a central log aggregation server. The central summary sorting module merge-sorts the synchronized log files and index files, then packages and compresses them into per-channel archives, which are stored on the server's hard drive for users to download.
The system is designed and implemented using the techniques and principles above. Each module is coded and compiled separately, which gives a high reuse rate and low coupling. After a long period of extensive functional and performance testing of the whole system, analysis of the test results shows that the system is practical and feasible.
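The offline split-and-merge flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis's code: it assumes log lines of the form `<timestamp> <channel> ...` with an integer timestamp, assumes each node's log is already in timestamp order, and uses hypothetical function names. It works in memory and omits the index files and the synchronization step.

```python
import gzip
import heapq
from collections import defaultdict


def timestamp_of(line):
    """Sort key: the leading integer timestamp (assumed log format)."""
    return int(line.split(" ", 1)[0])


def split_by_channel(lines):
    """Block cutting (hypothetical): group one node's lines by channel.

    The channel is assumed to be the second whitespace-separated field.
    Lines with fewer than two fields are skipped.
    """
    blocks = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        blocks[parts[1]].append(line)
    return dict(blocks)


def merge_channel(blocks):
    """Central sorting (hypothetical): k-way merge of per-node blocks.

    Each block must already be sorted by timestamp; heapq.merge then
    yields one globally sorted stream without re-sorting everything.
    """
    return list(heapq.merge(*blocks, key=timestamp_of))


def archive(lines):
    """Package one channel's merged lines as gzip-compressed bytes."""
    return gzip.compress(("\n".join(lines) + "\n").encode("utf-8"))
```

A k-way merge is used here because each node's block is already ordered, so the central server only has to interleave the blocks instead of sorting the full data set again.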
Keywords/Search Tags:log analysis, traffic log, huge amounts of data, distributed