With the rapid development of the Internet in recent years, various forms of network attacks have emerged one after another, so how to effectively detect abnormal behavior and identify its attack category has become an important research subject in the field of cyber security. In response to the inadequate accuracy and feature extraction ability of existing intrusion detection technology, this thesis proposes a network intrusion detection system based on Stacking technology, which combines the advantages of a deep Auto-Encoder and traditional machine learning.

The design of the intrusion detection system consists of five main steps. In the first step, a tool based on the libpcap dynamic library is designed to capture data. In the second step, the captured data is preprocessed. In the third step, a deep Auto-Encoder is used to extract new features from the data. In the fourth step, the new features produced by the deep Auto-Encoder are concatenated with the original features to train each machine learning model. Finally, the results of several machine learning models are combined by Stacking technology to produce the final output.

The main work is as follows:

1. A network protocol parser is designed. It is mainly used to capture network traffic and parse it into fixed-format data. First, the meaning of each feature in the dataset is analyzed. Then, the data link layer, the network layer, and the transport layer of the honeypot system traffic are parsed on the basis of the TCP/IP protocol stack. Finally, the required features are computed. This process yields high-quality data.

2. High-order features of the dataset are extracted by a deep Auto-Encoder, with the aim of capturing higher-order nonlinear features. A 2-layer deep Auto-Encoder is used to mine the high-order features, and its output is then concatenated with the cleaned data. This operation enlarges the feature set and increases accuracy by 0.6%, which indicates that the deep Auto-Encoder can improve the accuracy of the model.

3. The models are combined by Stacking in order to improve the stability and accuracy of the model. One random forest model and two LightGBM models configured with different parameter settings are selected as the first-layer models, and logistic regression is chosen as the second-layer model, forming a 2-layer Stacking framework. Because the data is imbalanced, macro F1 is selected as the model evaluation metric in the training phase. The results show that accuracy increases by 1.1%, so a properly designed Stacking scheme can improve the accuracy of the model.

The experiments use the KDD99 dataset, a benchmark dataset for intrusion detection systems, to analyze the results and compare performance with existing intrusion detection technology. The experimental results show that the proposed model achieves 95.8% accuracy. Therefore, this intrusion detection system based on Stacking technology can effectively detect intrusion behavior.
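
To illustrate why macro F1 is a reasonable choice for imbalanced data, the following minimal sketch computes plain accuracy and macro F1 on a toy label set. The labels and predictions are invented for illustration and are not results from this thesis; the function names `macro_f1` and `accuracy` are hypothetical helpers, not part of the described system.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced example (assumed data): 8 normal flows, 1 "dos", 1 "probe".
# The classifier misses the single "dos" sample and labels it "normal".
y_true = ["normal"] * 8 + ["dos", "probe"]
y_pred = ["normal"] * 8 + ["normal", "probe"]

print(round(accuracy(y_true, y_pred), 3))  # 0.9
print(round(macro_f1(y_true, y_pred), 3))  # 0.647
```

In this toy case accuracy stays high because the majority class dominates, while macro F1 drops sharply because the missed minority class contributes an F1 of 0 with the same weight as every other class.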