Design And Implementation Of Network Malicious Behavior Analysis System Based On Massive WEB Logs

Posted on:2016-04-01

Degree:Master

Type:Thesis

Country:China

Candidate:A L Xu

Full Text:PDF

GTID:2308330482951648

Subject:Software engineering

Abstract/Summary:

With the rapid development of computer technology and the Internet, a variety of WEB-based network applications have sprung up, and the number of WEB users has increased rapidly. However, WEB applications not only bring convenience to people’s study, work and life, but also pose serious threats to their information security. Because WEB applications are so widely used, Trojans, botnets and APT activities often exploit them to penetrate or intrude into networks and control victims, seriously threatening the information and property of network users. How to identify suspicious malicious network behaviors in massive WEB logs is therefore a key research question.

Currently, there are several challenges in building a WEB log mining system for massive data and applying it in real-world network security practice. First, the items in WEB logs are very complex: logs from distinct web sites and sources differ greatly in format, field titles and uniformity, which makes it difficult to process them in a uniform way. Second, the URL, which records the specific path visited on a web site, is a very important field in WEB logs. How to design a URL inspection module that detects malicious links, SQL injection and XSS scripts accurately and in a timely manner is worth studying. Finally, analysis and mining is the last stage of massive log processing. How to build a practical data analysis platform and design proper algorithms to dig out suspicious malicious behaviors is the key problem to solve.

Given the above issues, we design and implement a complete WEB log mining system for detecting malicious behaviors in the real world. Massive WEB logs are first obtained through cooperation with an Internet content provider (ICP). Then we build a prototype system to analyze the logs and discover hidden malicious users and behaviors.

The main contributions of this paper are as follows:

(1) We design and implement a WEB log preprocessing module that handles the different log formats and filters out improper and redundant logs. The module is divided into three parts: data cleaning, user identification and session identification.

(2) We design and implement a URL inspection module. The open source project libinjection is used in our system to detect SQL injection and cross-site scripting (XSS). In addition, we collect open URL data sets, which are used for malicious link scanning.

(3) For massive log analysis and mining, our system uses the high-performance Spark platform to measure WEB sessions and to perform statistical and correlation analysis in order to detect hidden malicious behaviors. First, we measure the distribution of the intervals between adjacent WEB log entries of the same user, which determines the timeout that separates different sessions of that user. Then, for each user, each Cip (client IP), and each Cip under a 24-bit or 16-bit mask, statistical and correlation analysis methods are used to identify the relationships between the user and geographical features, time, and pages. Finally, the malicious users and malicious behaviors are discovered by a comprehensive calculation.
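
The session identification and Cip aggregation steps in contribution (3) can be illustrated with a minimal sketch. It is not the thesis implementation: it uses plain Python rather than Spark, and the function names (adjacent_intervals, split_sessions, cip_prefix, count_by_prefix), the record layout (client IP, timestamp in seconds) and the 1800-second default timeout are all assumptions made for illustration. The abstract derives the timeout from the measured interval distribution; here it is simply a parameter.

```python
from collections import defaultdict
from ipaddress import ip_network

# Hypothetical default; the thesis derives the timeout from measured intervals.
SESSION_TIMEOUT = 1800  # seconds


def adjacent_intervals(timestamps):
    """Gaps between consecutive requests of one user, in seconds."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def split_sessions(timestamps, timeout=SESSION_TIMEOUT):
    """Split one user's request timestamps into sessions.

    A new session starts whenever the gap to the previous request
    exceeds the timeout.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= timeout:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


def cip_prefix(cip, mask_bits):
    """Network that contains the client IP under a 24- or 16-bit mask."""
    return str(ip_network(f"{cip}/{mask_bits}", strict=False))


def count_by_prefix(records, mask_bits):
    """Count log records per masked client network.

    records is an iterable of (client_ip, timestamp) pairs.
    """
    counts = defaultdict(int)
    for cip, _timestamp in records:
        counts[cip_prefix(cip, mask_bits)] += 1
    return dict(counts)
```

In a Spark job the same logic would typically be applied per key after grouping records by user or by masked Cip; the grouping itself is omitted here to keep the sketch self-contained.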

Keywords/Search Tags:

WEB log, data mining, malicious behavior, URL inspection, Spark

Related items

1	The Design And Implementation Of User Behavior Analysis System Based On Spark
2	Analyze And Research Of E-commerce User Behavior Based On Spark
3	Research On Detection Methods Of Abnormal Behavior In Security For IIoT Based On Random Inspection
4	The Design And Implementation Of Mobile Users' Behavior Data Query And Analysis System Based On Spark
5	The Research Of Network User Behavior Analysis Method Based On Spark
6	Research Of Big Data Analysis Of E-commerce User Behavior Based On Spark
7	Design And Implementation Of Real Time Detection System For Malicious Domain Based On Spark Framework
8	A Frequent Serial Episode Mining Algorithm With Time Constraints Based On Spark Platform
9	Research Of Large-scale Data Mining Technology Based On Spark
10	Research On Data Mining Technology Based On Spark