Research On Network Anomaly Traffic Based On Distributed Detection With Hadoop

Posted on: 2020-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: X L Ma
GTID: 2428330599456761
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of computer and network technology, computers and cyberspace are used ever more widely. As the openness and sharing of the Internet become more pronounced, it plays an increasingly important role in politics, the economy, finance, education, and the military. At the same time, the vulnerability of computers and networks, defects in network protocols, and hidden security vulnerabilities pose a serious threat to network security.

Network traffic data and web logs contain a great deal of valuable information, which makes them highly useful for intrusion detection, network management, and user behavior analysis. Traditionally, network data is processed on a single node, which is constrained by that node's CPU, I/O, and storage performance and does not scale. For large-scale, high-speed networks, traditional detection methods cannot meet the time and efficiency requirements of large-scale data analysis. Practical applications demand ever shorter response times for data analysis, so real-time, high-throughput parallel computing has become a basic requirement of network data processing, and distributed anomaly detection has become a new research direction in the anomaly detection field.

Against this background of an increasingly severe network security situation and rapidly growing volumes of network data, this thesis presents the design and implementation of distributed network traffic anomaly detection in an experimental cloud computing environment. It uses the storage and computing capacity that cloud computing provides for large-scale data to address the main problems of network data collection, storage, and analysis, and it uses Hadoop's distributed, parallel MapReduce model to process large data sets efficiently and accurately in parallel. The main contents and results of this thesis are as follows:

(1) Research on the architecture of a distributed anomaly detection platform. The architecture of the distributed intrusion detection platform is designed on the basis of a requirements analysis. The overall architecture is divided into three layers: a network collection layer, a data storage layer, and a distributed anomaly analysis layer. A complete experimental network environment covering network data collection, data storage, and data anomaly detection is built according to these requirements. Introducing an anomaly detection feature library improves the system's data processing capability, detection efficiency, and detection accuracy, and makes the system extensible and able to learn and to mine deeper analytical information.

(2) Research on network data sniffing and network log collection. Flume is used to store network log data, collected from multiple sources on front-end servers, in the HDFS distributed file system. Sniffer technology is used to capture network traffic and extract its features: the front end captures network data with WinPcap and LibPcap, extracts and stores traffic features by reconstructing session connections, and transmits the feature data in KDD99 format to the HDFS file system of the analysis system.

(3) Research on data analysis algorithms. A fuzzy clustering algorithm, a classification algorithm, and statistical methods are applied to the collected network traffic and web log data to verify the feasibility of the algorithms and the accuracy of the analysis. Clustering is performed by preprocessing the network data and applying the fuzzy C-means (FCM) clustering algorithm, which yields the cluster centers and cluster types from training sample data. Abnormal variance is used to detect distributed denial-of-service attacks, and a historical feature library is built to support rapid analysis of future data.

(4) Research on distributed computing for network anomaly detection. Data mining algorithms are combined with Hadoop's distributed computing model for parallel processing, with data stored in the HDFS distributed file system. Distributed anomaly detection experiments are carried out with MapReduce, Flume, and Mahout, using clustering and classification algorithms to mine abnormal information and abnormal network traffic from the data. The time efficiency, accuracy, error rate, and false alarm rate of distributed anomaly detection are analyzed.

In summary, the abnormal network traffic analysis experiment constructed in this study can effectively address network data collection, storage, and anomaly analysis. It combines the strengths of Hadoop and data mining: it takes advantage of the high scalability and high throughput of the Hadoop distributed computing framework, and uses data mining algorithms to detect abnormal information in network events in depth, with a low false alarm rate and relatively high detection accuracy.
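
As an illustration of the fuzzy C-means clustering mentioned in item (3), the following is a minimal single-machine sketch in plain Python. It is not the thesis's implementation, which runs on Hadoop with Mahout. The function name `fuzzy_c_means`, its parameters (`m`, `max_iter`, `tol`, `seed`), and the random initialisation are assumptions chosen for illustration, and the fuzzifier `m` is assumed to be greater than 1.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Cluster `points` (equal-length float tuples) into `c` fuzzy clusters.

    Returns (centers, memberships). memberships[i][j] is the degree to which
    point i belongs to cluster j; each row sums to 1. Hypothetical helper,
    written only to illustrate the standard FCM update rules.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))

        # Membership update from distances to the new centers.
        new_u = []
        for p in points:
            dists = [math.dist(p, ctr) for ctr in centers]
            if any(d == 0.0 for d in dists):
                # The point coincides with a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_u.append([h / sum(hits) for h in hits])
                continue
            new_u.append([
                1.0 / sum((dj / dk) ** exponent for dk in dists)
                for dj in dists
            ])

        # Stop once no membership value moved by more than `tol`.
        shift = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break

    return centers, u
```

In a setting like the one the abstract describes, each point would be a numeric feature vector derived from a KDD99-style connection record, and the learned centers could be kept in a feature library so that new records are scored by their membership values.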
Keywords/Search Tags: Detection of Network Traffic Anomalies, Hadoop, Log Analysis, FCM, Data Mining, Flume, Intrusion Detection