Design And Implementation Of Network Malicious Behavior Analysis System Based On Massive WEB Logs

Posted on: 2016-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: A L Xu
Full Text: PDF
GTID: 2308330482951648
Subject: Software engineering
Abstract/Summary:
With the rapid development of computer technology and the Internet, a variety of WEB-based network applications have sprung up, and the number of WEB users has also increased rapidly. However, WEB applications not only bring convenience to people’s study, work and life, but also pose serious threats to their information security. Because WEB applications are so widely used, Trojans, botnets and APT activities often exploit them to penetrate or intrude into networks and control victims, seriously threatening the information and property of network users. How to identify suspicious malicious network behaviors in massive WEB logs is therefore a key research question.

Currently, there are several challenges in building a WEB log mining system for massive data and applying it to real-world network security practice. Firstly, the items in WEB logs are very complex: logs from distinct web sites and sources differ greatly in format, field names and uniformity, which makes it difficult to process them in a uniform way. Secondly, the URL, the specific path visited on a web site, is a very important field in WEB logs. How to design a URL inspection module that detects malicious links, SQL injection and XSS scripts accurately and in a timely manner is worth studying. Finally, analysis and mining is the last stage of massive log processing. How to build a practical platform for data analysis and design proper algorithms to dig out suspicious malicious behaviors is the key problem to solve.

Given the above issues, we design and implement a complete WEB log mining system for detecting malicious behaviors in the real world. First, massive WEB logs are obtained through cooperation with an Internet content provider (ICP). Then, we build a prototype system to analyze the logs and discover hidden malicious users and behaviors. The main contributions of this paper are as follows:

(1) We design and implement a WEB log preprocessing module that handles the different log formats and filters out improper and redundant logs. The module is divided into three parts: data cleaning, user identification and session identification.

(2) We design and implement a URL inspection module. The open source project libinjection is used in our system to detect SQL injection and cross-site scripting (XSS). In addition, we collect open URL data sets, which are used for malicious link scanning.

(3) For massive log analysis and mining, our system uses the high-performance Spark platform to measure WEB sessions and perform statistical and correlation analysis in order to detect hidden malicious behaviors. First, we measure the distribution of the intervals between adjacent WEB log entries of the same user, which determines the timeout that separates different sessions of that user. Then, for each user, for each client IP (Cip), and for each Cip under a 24-bit or 16-bit mask, statistical and correlation analysis methods are used to identify the relationships between the user and geographical features, time and pages. Finally, the malicious users and malicious behaviors are discovered by a comprehensive calculation over these results.
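
The session timeout and the masked client IPs mentioned in contribution (3) can be illustrated with a minimal sketch. It is plain Python rather than Spark, and it is not the thesis's implementation: the function names `split_sessions`, `prefix_key` and `count_by_prefix`, the record layout `(cip, timestamp)`, and the 1800-second timeout in the usage note are all assumptions made for illustration. The abstract derives the timeout from the measured interval distribution and does not state its value.

```python
from collections import defaultdict
from ipaddress import ip_network


def split_sessions(timestamps, timeout):
    """Split one user's request times (in seconds) into sessions.

    A new session starts whenever the gap to the previous request
    exceeds `timeout`. The timeout value is supplied by the caller;
    the thesis derives it from data, this sketch does not.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= timeout:
            sessions[-1].append(ts)   # gap within timeout: same session
        else:
            sessions.append([ts])     # first request, or gap too large
    return sessions


def prefix_key(cip, mask_bits):
    """Map an IPv4 client address to its network under a 24- or 16-bit mask."""
    # strict=False lets ip_network zero the host bits instead of raising.
    return str(ip_network(f"{cip}/{mask_bits}", strict=False))


def count_by_prefix(records, mask_bits):
    """Count log records per masked client IP.

    `records` is an iterable of (cip, timestamp) pairs, a hypothetical
    layout chosen for this sketch.
    """
    counts = defaultdict(int)
    for cip, _timestamp in records:
        counts[prefix_key(cip, mask_bits)] += 1
    return dict(counts)
```

For example, `split_sessions([0, 10, 20, 2000, 2010], 1800)` groups the first three requests into one session and the last two into another, and `prefix_key("10.1.2.3", 16)` yields `"10.1.0.0/16"`. Grouping by prefix in this way lets per-user statistics be repeated at the /24 and /16 level, so that activity spread across neighbouring addresses is counted together.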
Keywords/Search Tags: WEB log, data mining, malicious behavior, URL inspection, Spark