High-performance Acquisition And Intelligent Analysis Of Large-scale Network Data

Posted on: 2021-03-13  Degree: Master  Type: Thesis
Country: China  Candidate: S W Sun  Full Text: PDF
GTID: 2428330632462950  Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of Internet technology, tens of millions of users access the Internet and generate massive amounts of network data. This data contains abundant information about network operation and user behavior, which is of great significance for monitoring network status, improving network operation and maintenance, understanding user behavior, and mining user intent. However, what flows across network links in real time is binary-encoded data, which is difficult for people to interpret. It is therefore important to collect network data and parse it into structured data with high performance. The collected data is large in volume, complex in type, and low in value density, so its value can be extracted only by processing and analyzing it efficiently with big data technology.

First, this thesis describes the framework design of a self-developed network traffic collection system and the technology selection for its core modules. Based on the characteristics of the collected network data, an efficient memory model, a flow table structure, and a flow record association algorithm are designed. The performance of the system and the accuracy of its output are evaluated through experiments.

Second, this thesis uses Spark to extract the network data collected by this system at the exit of a LAN into multiple time series representing changes in network traffic. Three models are then used to predict the network traffic, and the experimental results are compared and analyzed. Furthermore, the one-dimensional sequence is expanded to multiple dimensions by adding external features, and a dataset composed of multiple time series is used to train a model with strong generalization ability, which improves prediction accuracy for sequences that lack historical data.

Finally, a multi-user network traffic analysis platform is designed and built to solve the problems encountered in data extraction and model training. The platform connects to the upstream network data acquisition system and provides it with distributed data storage. It uses containerization to decouple physical computing resources from the environment dependencies of the user layer, and implements on-demand resource allocation and intelligent resource scheduling. This thesis also implements distributed training on the platform and trains the network traffic prediction model on it, which shows that the platform can provide effective acceleration.
Keywords/Search Tags:Network data collection, Traffic prediction, Distributed, Data analysis platform
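
The abstract mentions a flow table and the extraction of traffic time series but does not give their details. The sketch below is only an illustration of the general idea, not the thesis's design: it assumes a flow is identified by a bidirectional 5-tuple, and it bins bytes into fixed time windows in plain Python rather than Spark. The names `Packet`, `FlowRecord`, `flow_key`, `build_flow_table`, and `traffic_series` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical parsed packet header (not the thesis's record format)."""
    ts: float      # capture timestamp in seconds
    src: str       # source IP address
    dst: str       # destination IP address
    sport: int     # source port
    dport: int     # destination port
    proto: int     # IP protocol number (6 = TCP, 17 = UDP)
    length: int    # packet length in bytes


FlowKey = Tuple[str, str, int, int, int]


@dataclass
class FlowRecord:
    first_seen: float
    last_seen: float
    packets: int = 0
    bytes: int = 0


def flow_key(p: Packet) -> FlowKey:
    # Assumption: both directions of a conversation belong to one flow,
    # so the two endpoints are put into a canonical (sorted) order.
    lo, hi = sorted([(p.src, p.sport), (p.dst, p.dport)])
    return (lo[0], hi[0], lo[1], hi[1], p.proto)


def build_flow_table(packets: Iterable[Packet]) -> Dict[FlowKey, FlowRecord]:
    """Aggregate packets into per-flow records keyed by the 5-tuple."""
    table: Dict[FlowKey, FlowRecord] = {}
    for p in packets:
        key = flow_key(p)
        rec = table.get(key)
        if rec is None:
            rec = table[key] = FlowRecord(first_seen=p.ts, last_seen=p.ts)
        rec.first_seen = min(rec.first_seen, p.ts)
        rec.last_seen = max(rec.last_seen, p.ts)
        rec.packets += 1
        rec.bytes += p.length
    return table


def traffic_series(packets: Iterable[Packet], bin_seconds: float) -> List[int]:
    """Sum bytes per fixed-width time bin, starting at the earliest packet."""
    pkts = list(packets)
    if not pkts:
        return []
    start = min(p.ts for p in pkts)
    end = max(p.ts for p in pkts)
    series = [0] * (int((end - start) // bin_seconds) + 1)
    for p in pkts:
        series[int((p.ts - start) // bin_seconds)] += p.length
    return series
```

A series produced this way (one value per time bin) is the kind of one-dimensional input a traffic prediction model would consume; a production collector would additionally need flow timeouts and bounded memory, which this sketch omits.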