
Web Intrusion Detection Based On Imbalanced Data Classification Method

Posted on: 2019-02-20   Degree: Master   Type: Thesis
Country: China   Candidate: S W Cao   Full Text: PDF
GTID: 2428330551960315   Subject: Computer technology
Abstract/Summary:
Websites are becoming increasingly common in daily work. On the one hand, they bring convenience to our lives; on the other hand, they must be protected against many kinds of intrusion. IIS web logs record every visitor's behavior and therefore contain the telltale signs of illegal intrusions. Analyzing suspicious behavior through log files has become an important part of website intrusion detection, and researchers have applied data mining techniques to this task with considerable progress. In a website's daily traffic, normal requests far outnumber illegal ones, so the IIS log is a typical imbalanced data set. The key problem is how to separate the illegal access records from the mass of records with a classification algorithm. Imbalanced data classification methods can therefore be used to analyze web logs, find illegal intruders on running sites, and predict the class of new visitors. This is of great significance for improving website security, optimizing the network environment, and ensuring the normal use of the website. This paper applies imbalanced data classification methods to intrusion detection. Log records are divided into two classes (illegal access records and normal access records). The paper compares the classification performance of different algorithms and designs and implements an intrusion detection system based on the IIS log. The main work of this paper is as follows:
(1) According to the IIS log features and intrusion keywords, database technology is used to complete data attribute selection. Eight sampling algorithms (random under-sampling, SMOTE, Tomek links, K-Means, OSS, SMOTE+Tomek links, SMOTE+K-Means, SMOTE+OSS) and three classification methods (C4.5, 3-NN, Naive Bayes) are combined to form various classification models, and their classification performance is compared to find the optimal model.
(2) This paper designs and implements an intrusion detection system based on the IIS log. The system is divided into four functional modules: the data acquisition layer, the data processing layer, the data analysis layer, and the data prediction layer. The system collects and preprocesses log data to form the required format and attributes, combines the various sampling and classification algorithms, and then analyzes and compares their classification performance. Finally, the optimal classification model is selected to classify newly collected log data.
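The abstract does not give implementation details, so the following is only a minimal sketch of one of the sampler and classifier pairings it names: random under-sampling followed by 3-NN. It is not the thesis's code. The two features (URL length and a count of intrusion keywords), the toy records, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical log records as (url_length, intrusion_keyword_count).
# Normal traffic heavily outnumbers illegal traffic, as in a real IIS log.
normal = [(20 + i % 5, 0) for i in range(20)]
illegal = [(80, 3), (95, 2), (110, 4)]
rows = normal + illegal
labels = ["normal"] * len(normal) + ["illegal"] * len(illegal)

# Balance the classes, then classify two unseen records with 3-NN.
balanced_rows, balanced_labels = random_undersample(rows, labels)
class_counts = Counter(balanced_labels)
suspicious_prediction = knn_predict(balanced_rows, balanced_labels, (100, 3))
ordinary_prediction = knn_predict(balanced_rows, balanced_labels, (22, 0))

print(class_counts)
print(suspicious_prediction, ordinary_prediction)
```

Under-sampling discards majority-class records, which is why the abstract also lists over-sampling (SMOTE) and cleaning methods (Tomek links, OSS) as alternatives. In practice the features would also be scaled before a distance-based classifier such as 3-NN is applied.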
Keywords/Search Tags: IIS, Imbalanced data, Intrusion detection, Sampling algorithm, Classification model