
Performance Analysis And Evaluation Of Large-scale Network Traffic Analysis System Based On Hadoop

Posted on: 2017-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: R Tao
GTID: 2348330518495914
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of computer technology and the mobile Internet, we have entered the era of big data. As an advanced big data processing tool, Hadoop offers a simple parallel programming model, large data storage capacity, and efficient computing capability, and it has therefore become the first choice of most enterprises and scientific research institutions. A Hadoop-based large-scale network traffic analysis system stores massive amounts of network traffic data and runs offline analysis applications over these data. The efficiency of these applications is directly determined by the performance of the analysis system. Analyzing and evaluating that performance effectively and accurately is therefore key to guiding user behavior and improving performance. In this thesis, we first introduce the basics of Hadoop and give an overview of the main factors that influence Hadoop performance and of recent research results. Based on the actual conditions of the analysis system, we propose a performance analysis and evaluation scheme. We then explain why the running times of three benchmark applications are chosen as the evaluation benchmark for the performance of the analysis cluster. Next, we list the selected performance data and introduce the Flume-based data collection system. We also introduce two modeling approaches, linear multiple linear regression and the nonlinear BP neural network, which are used to model the relationship between the evaluation benchmark and the performance data. Finally, we analyze and evaluate the performance of the analysis system with the two validated models, compare their advantages and disadvantages, and use the analysis and evaluation results to guide user behavior and optimize system performance.
Keywords/Search Tags: performance evaluation, Hadoop, multiple linear regression, BP neural network, execution time prediction
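
The multiple linear regression approach named in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual performance metrics, coefficients, or benchmark applications, so everything concrete here is an assumption: the three feature columns, the "true" coefficients, and the data are synthetic and hypothetical, and the fit uses ordinary least squares via NumPy rather than whatever tooling the thesis used.

```python
import numpy as np


def fit_linear_model(X, y):
    """Fit y ~ b0 + X @ b by ordinary least squares; return (b0, b)."""
    # Prepend a column of ones so the intercept is estimated with the slopes.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]


def predict(b0, b, X):
    """Predict running times for the metric rows in X."""
    return b0 + X @ b


# Synthetic data only: three hypothetical, normalized cluster metrics
# (for example CPU, disk, and network load) and a made-up linear
# relationship to running time in seconds. None of this is thesis data.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
true_b0 = 30.0
true_b = np.array([50.0, 20.0, 10.0])
y = true_b0 + X @ true_b + rng.normal(0.0, 0.5, size=200)

b0, b = fit_linear_model(X, y)
rmse = float(np.sqrt(np.mean((predict(b0, b, X) - y) ** 2)))
print("intercept:", b0)
print("coefficients:", b)
print("training RMSE:", rmse)
```

In a setup like the one the abstract describes, the rows of `X` would come from the collected performance data and `y` from the measured running times of a benchmark application; a nonlinear model such as a BP neural network would replace `fit_linear_model` when the relationship is not well captured by a linear fit.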