
Research On High-speed Network Intrusion Detection Based On Imbalanced Data Sets Classification

Posted on: 2011-07-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y A Zhao
GTID: 1118360305471344
Subject: Computer application technology
Abstract/Summary:
With the rapid development of network technologies and applications, a growing variety of attack techniques poses a serious challenge to network security. Ensuring that a network intrusion detection system (IDS) can analyze packets in real time and keep false positives low under high-speed traffic has become an important problem. After a thorough study of IDS modules, this dissertation proposes a novel Two-stage Strategy High-speed Intrusion Detection System Model Based on Load Balancing (TSMBLB). On top of this model, it proposes a hierarchical detection method oriented toward attack classification. Because intrusions typically make up a very small fraction of total network traffic, the off-line phase treats high-speed network intrusion detection as an imbalanced-data classification problem.

The main contributions of the dissertation are as follows:

1. To address the efficiency of high-speed network intrusion detection, the dissertation presents the TSMBLB model. The framework has two phases. In the on-line phase, the system captures packets from the network and splits the traffic into smaller streams according to a load-balancing algorithm; each sensor hands its detection results to the control and analysis host. In the off-line phase, the training dataset is split with the same algorithm used on-line, and classification patterns are built for each subset. Once the intrusion patterns are created, the module outputs them as input to the corresponding sensor. Load balancing improves processing speed, and anomaly detection techniques are used to detect new attacks.

2. To build an efficient, real-time intrusion detection method, the dissertation proposes attack-classification-oriented hierarchical detection. Based on the TSMBLB model, attacks are classified in order of detection time, detection proceeds layer by layer, and tasks are assigned to each sensor. An attack that can be detected at a higher layer does not need to be detected again at a lower layer. This ensures that no classification is repeated, simplifies detection, and improves detection efficiency.

3. To address the low detection rate for minority attack classes in network intrusion detection, a classification framework for imbalanced data sets is designed on the TSMBLB model. It uses the Relief algorithm for feature selection, SMOTE over-sampling to increase the number of attack samples, and AdaBoost with C4.5 or random forest as the weak classifier to improve detection of rare classes. Classification ability is measured per class with precision, recall, F-measure, and ROC curves under 10-fold cross-validation. Experiments show that the framework dramatically reduces the time needed to build patterns and increases the detection rate for minority intrusions.

4. Because network packet data contains a large number of "useless" and "noise" samples, the dissertation proposes a resampling method for learning from imbalanced datasets: Fast Hierarchical Nearest Neighbor (FHNN). The basic idea is to divide the current set more or less equally into a few subsets, resample each subset, and merge the resampling results into the target set. For each subset, before sampling, one sample is randomly selected from each class and placed in a new structural set. Then, for each example in the subset, its nearest neighbor is found; if the example is misclassified, it is moved from the subset to the structural set. Experimental results show that FHNN is very effective at handling noise and majority-class examples and is faster than other methods, running in linear time. In addition, when new data or new attacks arrive, the method is applied to the new dataset, and the resampling results are merged with the original training set S to form a new training dataset.
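The FHNN resampling steps described in contribution 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names (`nearest_label`, `condense_subset`, `fhnn_resample`, `update_training_set`), the `n_subsets` and `seed` parameters, the Euclidean distance, the round-robin partitioning, and the choice to search for the nearest neighbor inside the structural set are all hypothetical details that the abstract does not specify.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Tuple[float, ...], str]  # (feature vector, class label)


def nearest_label(point: Sequence[float], store: List[Sample]) -> str:
    """Return the label of the sample in `store` closest to `point`.

    Assumption: Euclidean distance; the abstract does not name a metric.
    """
    return min(store, key=lambda s: math.dist(point, s[0]))[1]


def condense_subset(subset: List[Sample], rng: random.Random) -> List[Sample]:
    """Single pass over one subset, building its structural set."""
    by_class: Dict[str, List[int]] = {}
    for i, (_, label) in enumerate(subset):
        by_class.setdefault(label, []).append(i)

    # Seed the structural set with one random sample from each class.
    seeds = {label: rng.choice(idxs) for label, idxs in sorted(by_class.items())}
    store = [subset[i] for i in seeds.values()]
    seed_indices = set(seeds.values())

    # Move every sample that the current structural set misclassifies.
    # Assumption: the nearest neighbor is looked up in the structural set.
    for i, (x, label) in enumerate(subset):
        if i in seed_indices:
            continue
        if nearest_label(x, store) != label:
            store.append((x, label))
    return store


def fhnn_resample(data: List[Sample], n_subsets: int = 4, seed: int = 0) -> List[Sample]:
    """Split `data` into roughly equal subsets, condense each, merge the results."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Round-robin slicing gives subsets whose sizes differ by at most one.
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    merged: List[Sample] = []
    for subset in subsets:
        if subset:
            merged.extend(condense_subset(subset, rng))
    return merged


def update_training_set(old: List[Sample], new_batch: List[Sample]) -> List[Sample]:
    """Resample only the new batch and merge it with the existing training set."""
    return list(old) + fhnn_resample(new_batch)
```

In this sketch each sample is visited once per subset, and samples that the structural set already classifies correctly are dropped, which is how redundant majority-class examples get thinned out. `update_training_set` mirrors the incremental step: only the newly arrived data is resampled before being merged with the earlier training set.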
Keywords/Search Tags: high-speed network, intrusion detection, attack classification, imbalanced data sets, ensemble learning, resampling methods