
Problem Research And Platform Development Of CDN Massive Log Real-Time Analysis

Posted on: 2020-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Xu
GTID: 2428330602950482
Subject: Control theory and control engineering
Abstract/Summary:
The Content Delivery Network (CDN) was born in the 1990s. A CDN consists of a number of node servers and a global intelligent load-balancing system. It is used to relieve network congestion and provide network acceleration services, and it is an important building block of the Internet. The request and response information of the acceleration service is recorded in CDN logs. Analyzing the logs of CDN node servers in real time, mining the information and value they contain, and detecting and monitoring the core indicators of CDN acceleration services in real time can provide important and reliable data support for strategies to improve service quality. Because a CDN is a huge distributed network, a CDN acceleration platform can receive on the order of tens of millions of requests per second on average, and the log data generated can reach the GB level per second and the PB level per day. How to design a low-latency, high-throughput platform for real-time analysis of massive CDN logs is an urgent problem in current CDN research.

This thesis analyzes the difficulties in real-time analysis of massive CDN logs and the problems in current solutions, proposes corresponding design schemes and architectures, and finally forms the architecture of a low-latency, high-throughput real-time CDN log analysis platform.

To address the difficulty of feeding massive CDN logs into the log computation engine in real time, together with the poor scalability of log reception and the inefficient switching of log traffic in current solutions, a log receiving component, ARRIS, is designed using current load-balancing and microservice registration and discovery technologies. ARRIS can scale its log receiving and processing capacity linearly and switch log traffic intelligently.

To address the write spikes that occur in the computation engine and the analysis result storage platform, a data interaction mechanism, AA, between the computation engine module and the analysis result ingestion module is designed, inspired by the idea of decoupling. In addition, based on the AA mechanism and using current mainstream stream computing frameworks and messaging systems, an ingestion component, Anti-Flood, with rate-limiting and circuit-breaking functions is designed.

To cope with the difficulty of accessing massive analysis result data in real time, and with the high operation and maintenance cost, low throughput, and low real-time query efficiency of the current MySQL sharding scheme (splitting databases and tables), this thesis classifies the CDN log analysis results according to their features and, using existing data storage technologies, designs a high-throughput real-time data storage platform, RTDP.

Finally, ARRIS, Anti-Flood, and the whole massive CDN real-time log analysis platform are verified. The results show that ARRIS has good real-time log reception and preprocessing capacity. Anti-Flood effectively decouples the computation engine from the storage module and has good ingestion capacity and rate-limiting behavior. The platform can analyze the massive logs of a CDN in real time and can track and monitor the quality of CDN-accelerated services in real time.
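The abstract does not give implementation details for Anti-Flood, so the following is only a minimal sketch of the general idea it names: a buffer decouples the producer of analysis results from the storage writer, and writes are gated by a rate limiter and a circuit breaker. An in-memory `deque` stands in for the messaging system, and every name here (`TokenBucket`, `CircuitBreaker`, `drain`) is hypothetical rather than taken from the thesis.

```python
import time
from collections import deque


class TokenBucket:
    """Rate limiter: about `rate` writes per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        # Refill tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; permits a retry after `reset_after` seconds."""

    def __init__(self, max_failures, reset_after, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, ok):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()


def drain(buffer, write, bucket, breaker):
    """Move buffered results to storage while the limiter and breaker permit it.

    Returns the number of results written; anything not written stays buffered.
    """
    written = 0
    while buffer and breaker.allow() and bucket.try_acquire():
        item = buffer[0]
        try:
            write(item)
        except Exception:
            breaker.record(False)  # storage failed: count it and stop this pass
            break
        breaker.record(True)
        buffer.popleft()  # remove only after a successful write
        written += 1
    return written
```

In this sketch the computation side only appends to the buffer and never waits on storage, so a write spike lengthens the buffer instead of overloading the storage platform. A real deployment would presumably rely on a durable message queue rather than process memory.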
Keywords/Search Tags: CDN, Log Analysis, Real-Time Analysis, Big Data