The Design And Implementation Of Network Data Analysis System Based On Spark Platform

Posted on: 2018-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Lin
Full Text: PDF
GTID: 2348330518495694
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, the content delivery network (CDN) has come to play an important role in the Internet architecture, and users' Internet access records are stored in CDN service providers' Web logs. CDN vendors share a general need to analyze this massive network data: their product managers, operators, and other non-technical staff need to run common analyses on it. CDN service providers currently lack a universal network data analysis service platform, so there is an urgent need for a generic platform that lets such users work with big data tools without a technical barrier to entry.

To provide a service platform for analyzing massive network data that is universal, easy to operate, and easy to extend, this thesis designs and implements a network data analysis service platform based on Spark, building on existing distributed frameworks. The main work is as follows:

(1) Preprocessing and processing of massive network data with Spark. Based on the characteristics of network data, the thesis designs and implements a network data analysis service tool.

(2) Web technology for the big data platform. The thesis studies how to browse network data held in the distributed storage engine from the Web platform, and how to run massive network data analysis tasks from the Web platform.

(3) Monitoring through Yarn. Based on Yarn's management mechanism on the big data platform, the thesis analyzes the relationship between the Yarn resource manager and the Spark computing engine, and studies how to monitor Spark tasks on the big data platform by monitoring Yarn, so as to ensure the availability of the whole system.

(4) Visualization of big data analysis results. After surveying third-party visualization plug-ins, the thesis adopts ECharts to present the analysis results on the page.

Using the solutions from this research, the thesis implements the Spark-based data analysis functions and the Web front end of the big data platform, and verifies the effectiveness of these functions and of the platform through experiments. The completed network data analysis service platform provides users with analysis functions, network data preview, data visualization, and system monitoring. It offers a way to understand users' Internet browsing behavior and creates the conditions for website providers and CDN vendors to optimize their services.
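The monitoring idea in item (3), watching Spark tasks by watching Yarn, can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the platform polls the Yarn ResourceManager REST endpoint `/ws/v1/cluster/apps` and reads the `applicationType`, `state`, and `finalStatus` fields of each application. The host name, the application IDs, the sample payload, and the function names `summarize_spark_apps` and `fetch_apps` are all hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Assumption: hypothetical ResourceManager address; 8088 is the usual default web port.
RM_APPS_URL = "http://resourcemanager.example:8088/ws/v1/cluster/apps?applicationTypes=SPARK"


def fetch_apps(url: str = RM_APPS_URL, timeout: float = 5.0) -> str:
    """Fetch the raw JSON application list from the ResourceManager (needs a live cluster)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def summarize_spark_apps(payload: str) -> dict:
    """Count Spark applications per Yarn state and list the ones that need attention."""
    doc = json.loads(payload)
    # "apps" may be null when no application matches, so fall back to an empty dict.
    apps = (doc.get("apps") or {}).get("app", [])
    spark_apps = [a for a in apps if a.get("applicationType") == "SPARK"]
    by_state = Counter(a.get("state", "UNKNOWN") for a in spark_apps)
    unhealthy = [
        a["id"]
        for a in spark_apps
        if a.get("state") in {"FAILED", "KILLED"} or a.get("finalStatus") == "FAILED"
    ]
    return {"by_state": dict(by_state), "unhealthy": unhealthy}


# Hypothetical payload in the shape the endpoint returns, used here instead of a live call.
SAMPLE = json.dumps({
    "apps": {
        "app": [
            {"id": "application_0000000000000_0001", "applicationType": "SPARK",
             "state": "RUNNING", "finalStatus": "UNDEFINED"},
            {"id": "application_0000000000000_0002", "applicationType": "SPARK",
             "state": "FAILED", "finalStatus": "FAILED"},
            {"id": "application_0000000000000_0003", "applicationType": "MAPREDUCE",
             "state": "RUNNING", "finalStatus": "UNDEFINED"},
        ]
    }
})

if __name__ == "__main__":
    print(summarize_spark_apps(SAMPLE))
```

Keeping the parsing separate from the HTTP call means the summary logic can be checked against a saved payload without a running cluster. A Web dashboard such as the one described above could call `fetch_apps` on a timer and pass the result to `summarize_spark_apps`.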
Keywords/Search Tags: Spark, Yarn, data analysis, network data, data visualization