
Research On Network Intrusion Detection Method Based On Semi-supervised Ensemble Learning

Posted on: 2022-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhou
Full Text: PDF
GTID: 2518306539962419
Subject: Computer technology
Abstract/Summary:
With the continuous development of information technology, network security problems keep emerging, and network attacks are becoming more complex and diverse. A network intrusion detection system provides active defense: it continuously monitors network traffic and improves the security of the system. With the spread of artificial intelligence and improvements in computer hardware, many artificial intelligence techniques are now applied to network intrusion detection. However, in large volumes of network data, intrusion behavior accounts for only a small fraction of the records, the class distribution is imbalanced, and the original network traffic contains a large amount of redundant data. These problems lead to low accuracy when a model identifies intrusion behavior.

The work of this paper is as follows:

(1) This paper introduces the background and practical significance of intrusion detection systems in detail, and reviews the current state of intrusion detection methods, at home and abroad, based on traditional rule matching, statistical knowledge and machine learning. It also describes the intrusion detection system, data preprocessing and the related machine learning background.

(2) To address the data imbalance in intrusion detection data, a data enhancement method based on a generative adversarial network is studied. This method learns the features of the rare data and augments the rare classes in the intrusion detection data set. After data enhancement, the feature dimension of the data set is reduced with an auto-encoder network, which shortens the training time of the model and lets the neural network converge quickly. Finally, a comparison experiment before and after data enhancement and a binary classification experiment are designed on the NSL-KDD data set.

(3) To address the low accuracy and poor generalization of a single machine learning model, an ensemble learning model design method based on Bagging is proposed. It uses bootstrap random sampling to draw multiple data subsets and then trains a base classifier on each resulting subset. The final model is obtained by combining the base classifiers through weighted voting, which avoids the tendency of a single machine learning model to become trapped in a local optimum. Binary (two-class) and five-class classification experiments are designed on the NSL-KDD data set.

The innovations in this paper include the following:

(1) An improved intrusion detection data enhancement scheme based on a generative adversarial network is proposed. Based on a study of the intrusion detection data set, this method replaces the generator network with a long short-term memory network. After training, it generates the rare intrusion detection data more effectively, so that the classes in the data set are kept in balance. At the same time, an improved auto-encoder network is used for feature extraction to speed up the convergence of the model. In the experiments, 3000 records are added to the minority classes in the data set, and the accuracy of the RF model is 0.96% higher after data enhancement. The method that combines the auto-encoder network with the data enhancement algorithm based on the generative adversarial network is 3.06% higher than the RNN model on the KDDTest+ test set and 8.76% higher than the LDA-CNN model on the KDDTest-21 test set.

(2) An ensemble learning model design method based on Bagging is proposed. To improve the accuracy of the model, the Bagging method is used to combine multiple base classifiers. The Bi-directional Long Short-Term Memory model is given a higher weight to enhance the generalization of the model. Compared with other experiments on the same benchmark data set, the accuracy of this method is 5.04% and 11.26% higher than the RNN model on the two test sets for binary classification, and 0.94% and 1.52% higher than the RNN model on the two test sets for five-class classification.
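
The Bagging procedure with weighted voting described in point (3) can be illustrated with a minimal sketch. This is not the thesis implementation: the thesis combines neural base classifiers on NSL-KDD, whereas the sketch below uses a toy threshold classifier on one-dimensional data so that it runs with the Python standard library alone. The function names (`bootstrap_sample`, `train_stump`, `weighted_vote`, `bagging_predict`), the toy data and the weight values are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement: one Bagging subset."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Toy base classifier (hypothetical stand-in for a real model).

    Learns a threshold halfway between the mean feature value of
    class 0 and class 1, for 1-D inputs with labels 0 or 1.
    """
    by_label = defaultdict(list)
    for x, y in sample:
        by_label[y].append(x)
    if len(by_label) < 2:
        # The bootstrap subset contained a single class: always predict it.
        only_label = next(iter(by_label))
        return lambda x: only_label
    mean0 = sum(by_label[0]) / len(by_label[0])
    mean1 = sum(by_label[1]) / len(by_label[1])
    threshold = (mean0 + mean1) / 2
    above, below = (1, 0) if mean1 > mean0 else (0, 1)
    return lambda x: above if x > threshold else below


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def bagging_predict(models, weights, x):
    """Collect one prediction per base classifier and combine them."""
    return weighted_vote([model(x) for model in models], weights)


# Toy data: (feature, label) where 0 = normal and 1 = intrusion.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
rng = random.Random(0)
models = [train_stump(bootstrap_sample(data, rng)) for _ in range(5)]
# Assumed weights: the first model is trusted more than the others,
# loosely mirroring the higher weight given to one base classifier.
weights = [2.0, 1.0, 1.0, 1.0, 1.0]
prediction = bagging_predict(models, weights, 0.95)
```

With equal weights, `weighted_vote` reduces to plain majority voting; raising one weight lets a stronger base classifier override a narrow majority. How the weights are chosen is not stated in the abstract, so the values above are placeholders.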
Keywords/Search Tags: Intrusion Detection, Data Imbalance, Semi-supervised Learning, Ensemble Learning, Machine Learning