The Design And Implementation Of A Malicious Network Traffic Analysis System Based On Big Data

Posted on: 2016-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: X C Liu
GTID: 2298330467492548
Subject: Signal and Information Processing
Abstract/Summary:
This thesis studies the analysis and detection of malicious network traffic in the big data era, and presents the design and implementation of a Network Behavior Analysis and Surveillance System. The system is divided into four parts: data acquisition, a honeypot system, a Hadoop processing platform and a presentation server. The probe module collects data at the network egress points of hundreds of mid-sized enterprises; after pre-processing inside the probe, all packets are uploaded to the collection server. The Hadoop platform, which serves as the data integration and analysis platform, downloads the data daily to relieve storage pressure on the collection servers, and stores the downloaded data in HDFS in a fixed format. The honeypot network is mainly responsible for collecting the various Trojans and bots found in the network, extracting their features and sending them to the Hadoop platform for subsequent analysis. The processing and analysis results are presented as charts.

The honeypot system built in this thesis is a closed structure that attracts a variety of malicious network traffic, while the suspicious URL algorithm proposed in this thesis produces the list of suspicious URLs. This thesis also presents a method for detecting malicious network traffic. All of this work, together with the closed-loop honeypot system, is integrated into a Network Behavior Analysis and Surveillance System that can detect abnormal behavior effectively and accurately.

The data collection algorithm proposed in this thesis is based on the compound session and solves the problem of the probes' memory limit. A compound session is a composite entity uniquely identified by the tuple of source (src), destination (dst), network protocol and destination port. To eliminate the negative impact of data collection, we also propose a data processing algorithm based on MapReduce.
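The compound session key described above can be illustrated with a minimal Python sketch. Only the four-field key (src, dst, network protocol, destination port) comes from the abstract; the packet dictionary layout, the `CompoundSession` counters and the `aggregate` function are hypothetical names chosen for illustration, and the abstract does not say which statistics the probe actually keeps.

```python
from collections import namedtuple
from dataclasses import dataclass

# Key taken from the abstract: src, dst, network protocol, destination port.
# The source port is deliberately NOT part of the key.
SessionKey = namedtuple("SessionKey", ["src", "dst", "protocol", "dst_port"])


@dataclass
class CompoundSession:
    """Hypothetical per-key counters; the thesis may track other fields."""
    packets: int = 0
    total_bytes: int = 0
    first_seen: float = float("inf")
    last_seen: float = float("-inf")

    def add(self, ts, length):
        # Fold one packet into the running summary.
        self.packets += 1
        self.total_bytes += length
        self.first_seen = min(self.first_seen, ts)
        self.last_seen = max(self.last_seen, ts)


def aggregate(packets):
    """Group packets into compound sessions keyed by the four-field tuple."""
    sessions = {}
    for p in packets:
        key = SessionKey(p["src"], p["dst"], p["protocol"], p["dst_port"])
        sessions.setdefault(key, CompoundSession()).add(p["ts"], p["length"])
    return sessions


# Assumed packet layout (illustrative values only).
packets = [
    {"src": "10.0.0.5", "src_port": 40001, "dst": "203.0.113.7",
     "protocol": "TCP", "dst_port": 80, "ts": 0.0, "length": 100},
    {"src": "10.0.0.5", "src_port": 40002, "dst": "203.0.113.7",
     "protocol": "TCP", "dst_port": 80, "ts": 5.0, "length": 200},
    {"src": "10.0.0.5", "src_port": 40003, "dst": "203.0.113.7",
     "protocol": "UDP", "dst_port": 53, "ts": 6.0, "length": 60},
]
sessions = aggregate(packets)
```

Because the source port is not part of the key, repeated connections from one host to the same service collapse into a single entry. This is one plausible reading of how such a key could keep probe memory bounded.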
Finally, malicious traffic is detected through a three-step algorithm: data filtering, domain name matching and network node exclusion. The algorithm first identifies periodic behavior in network traffic, and then obtains the final detection results by checking against a whitelist and applying other means to remove misjudged objects.
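The periodicity check and whitelist exclusion can be sketched roughly as follows. This is not the algorithm from the thesis, whose details the abstract does not give. The sketch assumes that "periodic" means near-constant gaps between requests, measured by the coefficient of variation of those gaps. The function names, the `max_cv` and `min_events` thresholds, and the example domains are all hypothetical.

```python
from statistics import mean, pstdev


def is_periodic(timestamps, max_cv=0.1, min_events=4):
    """Return True if inter-arrival gaps are nearly constant.

    Assumption: periodicity is judged by the coefficient of variation
    (standard deviation / mean) of the gaps; thresholds are illustrative.
    """
    if len(timestamps) < min_events:
        return False
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return False
    return pstdev(gaps) / avg <= max_cv


def detect(requests_by_domain, whitelist):
    """Flag domains contacted periodically, then drop whitelisted ones."""
    suspects = [d for d, ts in requests_by_domain.items() if is_periodic(ts)]
    return sorted(d for d in suspects if d not in whitelist)


# Illustrative input: request times in seconds per contacted domain.
requests_by_domain = {
    "beacon.example": [0, 60, 120, 180, 240],     # regular 60 s beacon
    "news.example": [0, 5, 100, 130, 400],        # irregular browsing
    "update.example": [0, 3600, 7200, 10800],     # periodic but benign
}
whitelist = {"update.example"}
flagged = detect(requests_by_domain, whitelist)
```

The whitelist step matters because benign software, such as update checkers, also contacts servers at regular intervals. A periodicity test alone would misjudge such traffic as malicious.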
Keywords/Search Tags: Big Data, Malicious Traffic, Botnet, Hadoop, Compound Session