
An Intrusion Detection Method Based On Generative Adversarial Networks And Ensemble Learning

Posted on: 2023-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: X D Liu
Full Text: PDF
GTID: 2568307100975209
Subject: Cyberspace security
Abstract/Summary:
Intrusion detection is an essential tool for identifying network attacks. By analyzing traffic characteristics to classify network traffic as normal or attack behavior, it can effectively reduce the risk posed by network security threats. In recent years, researchers have introduced machine learning into intrusion detection. However, intrusion detection data are often imbalanced: normal behavior samples far outnumber attack behavior samples. This biases machine learning classifiers toward the majority class and degrades detection effectiveness. Through an investigation of machine learning-based intrusion detection methods, this thesis finds that data imbalance has not received sufficient attention in existing research. The high dimensionality and complexity of network traffic data make it challenging to solve the data imbalance problem in intrusion detection effectively. To address these issues, this thesis reduces the impact of data imbalance on intrusion detection at both the data level and the algorithm level. At the data level, it uses data oversampling to handle imbalanced datasets, aiming to increase the number and diversity of minority-class attack samples. At the algorithm level, it constructs intrusion detection models based on ensemble learning algorithms to further reduce the risk of false predictions. The main contributions of this thesis are as follows.

(1) This thesis uses a Systematic Literature Review (SLR) to conduct a comprehensive technical analysis of research on machine learning-based intrusion detection from the last decade. Following a rigorous selection process, 119 highly cited articles were chosen from 14,942 related studies. Organized around the main topics of intrusion detection research, the review summarizes and analyzes the current state of research in four aspects: data pre-processing, intrusion detection algorithms, evaluation criteria, and datasets. These results support the subsequent work in this thesis.

(2) To increase the number and diversity of minority attack samples, this thesis proposes a data oversampling method based on Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP). The method uses WGAN-GP to construct a generative model that learns the distribution of attack samples and generates simulated attack samples. To avoid introducing new noise, it identifies and eliminates likely noisy samples based on each sample's nearest-neighbor information. To reduce irrelevant and redundant features, it also applies Analysis of Variance (ANOVA) and sequential backward search for feature selection.

(3) To further reduce the impact of data imbalance on intrusion detection, this thesis proposes a new intrusion detection method based on data oversampling and eXtreme Gradient Boosting (XGBoost). For the binary classification task, the proposed oversampling method is applied to the dataset to reduce its imbalance, and an intrusion detection model is then built on XGBoost to further reduce the risk of false predictions. For the multiclass classification task, the dataset is likewise processed with the proposed oversampling method. A one-vs-one decomposition strategy then splits the multiclass task into several binary sub-tasks to reduce its complexity, and a separate XGBoost-based intrusion detection model is built for each binary sub-task. The final detection result is obtained by aggregating the results of the binary classifiers.

(4) This thesis designs a series of experiments on the NSL-KDD and UNSW-NB15 datasets to verify the effectiveness of the proposed methods. The experimental results show that the proposed data oversampling method effectively improves intrusion detection performance. Compared with the best baseline method, it improves the attack detection rate by about 4% on average and reduces the false alarm rate by about 20% on average. The results also show that the proposed intrusion detection method outperforms the baseline method, improving accuracy and F1 score by about 3%.
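
The pairwise decomposition in contribution (3), commonly called one-vs-one, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the thesis's implementation: the abstract does not say how the binary results are aggregated, so majority voting is assumed here; a toy nearest-centroid classifier (the hypothetical `CentroidClassifier`) stands in for the XGBoost models; and the data are made-up two-feature points, not NSL-KDD or UNSW-NB15 records.

```python
from collections import Counter
from itertools import combinations


class CentroidClassifier:
    """Toy binary classifier used as a stand-in for an XGBoost model (assumption)."""

    def fit(self, X, y):
        # Store the per-class mean of every feature column.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        # Return the class whose centroid is closest in squared Euclidean distance.
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def fit_one_vs_one(X, y, make_model=CentroidClassifier):
    """Train one binary model per unordered pair of classes."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        # Each sub-task only sees the samples of its two classes.
        pair_X = [x for x, t in zip(X, y) if t in (a, b)]
        pair_y = [t for t in y if t in (a, b)]
        models[(a, b)] = make_model().fit(pair_X, pair_y)
    return models


def predict_one_vs_one(models, x):
    """Aggregate the pairwise predictions by majority vote (assumed rule)."""
    votes = Counter(model.predict_one(x) for model in models.values())
    return votes.most_common(1)[0][0]


# Made-up two-feature samples for three traffic classes.
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [0.0, 5.0], [0.2, 5.1]]
y = ["normal", "normal", "dos", "dos", "probe", "probe"]

models = fit_one_vs_one(X, y)
print(len(models))                             # 3 pairwise models for 3 classes
print(predict_one_vs_one(models, [4.8, 5.1]))  # dos
```

With k classes the decomposition trains k(k-1)/2 binary models, each on the samples of only two classes. Swapping in a different binary learner only requires passing another factory as `make_model`, provided it offers the same `fit` and `predict_one` methods used here.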
Keywords/Search Tags: network intrusion detection, data imbalance, generative adversarial networks, ensemble learning