Spark-based Firewall Log Data Analysis And Mining Platform

Posted on: 2022-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: J J Lv
Full Text: PDF
GTID: 2518306509995059
Subject: Software engineering
Abstract/Summary:
Networks play an irreplaceable role in the information society, but frequent network security problems cannot be ignored. This holds equally for campus networks, which are closely tied to teaching and daily life: problems such as egress bandwidth limits and sudden threats can have serious consequences. Deploying a firewall helps manage the network effectively, and the logs it generates contain a wide range of information; analyzing them helps administrators understand the network status in a timely manner. However, the volume of logs generated by network devices such as firewalls is growing rapidly, and processing massive log data efficiently is itself an urgent problem.

To address these problems and requirements, this thesis takes firewall logs as its research object and designs and develops a log analysis and mining platform based on the Spark in-memory computing framework. The platform consists of four layers: a data source layer, a data storage layer, a data processing layer, and a data display layer. It contains five functional modules: data collection and storage, data preprocessing, data analysis, anomaly detection, and visualization. The thesis presents the overall architecture of the platform, the design of each functional module, and the implementation process. The platform makes full use of components in the Hadoop and Spark ecosystems for data analysis and processing: HDFS in Hadoop stores the data, Hive performs query statistics on offline data, the Spark in-memory computing framework runs the analysis tasks, and the Spark Streaming framework processes real-time data. In addition, a multi-class SVM algorithm is proposed to predict the risk level of network threat events, and it achieves good results. Finally, a results display interface built with JSP and ECharts visualizes the output of the data analysis module and the anomaly detection module.

In summary, this thesis designs a data processing platform based on the Spark distributed in-memory computing framework, which addresses the problems encountered in log data analysis and mining and improves efficiency. It performs multi-dimensional analysis of firewall traffic logs to help network administrators grasp the current network traffic status in a timely manner and formulate better traffic management measures. It also designs an anomaly detection algorithm for firewall threat logs, builds a support vector machine prediction model with better classification performance, and completes the classification and prediction of risk levels for threat behaviors, helping administrators assess the severity of threats and take appropriate countermeasures.
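
The abstract does not say how the multi-class SVM is built, which features it uses, or which library trains it. As an illustration only, the sketch below shows one common way to extend a binary SVM to several risk levels: one-vs-rest, with each binary linear SVM trained by sub-gradient descent on the regularized hinge loss. It is plain Python rather than Spark, and the two features, the labels `low`/`medium`/`high`, the toy data, and all function names are assumptions, not taken from the thesis.

```python
import random

def train_binary_svm(X, y, epochs=500, lr=0.1, lam=0.01, seed=0):
    """Linear SVM for labels in {-1, +1}, trained by stochastic
    sub-gradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # The regularization term shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge term contributes only when the margin is below 1.
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def train_one_vs_rest(X, labels, **kwargs):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if label == cls else -1 for label in labels]
        models[cls] = train_binary_svm(X, y, **kwargs)
    return models

def predict(models, x):
    """Return the class whose binary SVM gives the highest decision score."""
    def score(cls):
        w, b = models[cls]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)

# Hypothetical toy data: two normalized features per threat event
# (for example, event frequency and a signature severity score).
X_TRAIN = [
    (0.0, 0.1), (0.1, 0.0), (0.1, 0.1), (0.0, 0.0),   # low
    (0.9, 0.1), (1.0, 0.0), (0.8, 0.1), (1.0, 0.1),   # medium
    (0.1, 0.9), (0.0, 1.0), (0.1, 0.8), (0.0, 0.9),   # high
]
Y_TRAIN = ["low"] * 4 + ["medium"] * 4 + ["high"] * 4

if __name__ == "__main__":
    models = train_one_vs_rest(X_TRAIN, Y_TRAIN)
    for event in [(0.05, 0.05), (0.95, 0.05), (0.05, 0.95)]:
        print(event, predict(models, event))
```

One-vs-rest is only one option; one-vs-one voting or a kernel SVM would fit the same description, and the abstract alone does not settle which variant the thesis uses.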
Keywords/Search Tags: Firewall Log, Offline Analysis, Online Analysis, Network Anomaly Detection, Multi-Class SVM Algorithm