| The openness and shared nature of computer networks have driven the digital age, but networks face a growing volume of intrusions that seriously threaten the security of users' information and property. An intrusion detection system (IDS) is an important network security defense and has been widely studied. Deep forest (gcForest) is a recent machine learning method with good application potential for IDS. However, gcForest has several defects, including structural redundancy, high memory requirements, and insensitivity to class imbalance. In this thesis, we address these defects and then examine the application of the improved gcForest to IDS through experiments. The main contributions are as follows: (1) An intrusion detection method based on cascade XGBoost (gcXGBoost) is proposed. gcForest introduces extremely randomized forests into the cascade structure to prevent overfitting, which increases the complexity and computational cost of the model. XGBoost adds regularization terms to the loss function to limit model complexity and uses a second-order Taylor expansion of the loss function, so introducing XGBoost into the cascade structure can reduce model complexity while maintaining precision. (2) An intrusion detection feature selection method that integrates the Ant Colony Optimization (ACO) algorithm with XGBoost (XGB-ACO) is proposed. gcForest uses multi-grained scanning to enhance the cascade structure. However, multi-grained scanning consumes a large amount of memory and time, and it can itself be regarded as a feature selection method. Therefore, to reduce memory consumption and improve the efficiency of gcForest-based IDS, this thesis employs XGB-ACO for intrusion detection feature selection. (3) An intrusion detection method based on cascade Balanced Bagging (GCBalanced Bagging) is proposed. After the improvements to the cascade structure and feature selection, the macro F1-score of the gcForest-based IDS is already good, but the F1-score of the minority classes remains low because intrusion detection data are class-imbalanced. Therefore, Balanced Bagging is introduced into the cascade structure to improve the F1-score of the minority classes. Balanced Bagging performs balanced sampling of the data set so that each base learner in Bagging receives a balanced training set. In conclusion, this thesis verifies the proposed methods on several public data sets and NSL-KDD, and the experimental results confirm the reliability of the proposed methods. |
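
The balanced sampling step behind Balanced Bagging can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `balanced_bootstrap`, the label names, and the class counts are hypothetical, and the choice to resample every class down to the size of the rarest class is an assumption, since the abstract does not state the sampling strategy.

```python
import random
from collections import Counter, defaultdict


def balanced_bootstrap(labels, rng):
    """Return row indices for one class-balanced bootstrap sample.

    Assumption: every class is drawn with replacement down to the size
    of the rarest class, so a base learner sees equal class counts.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    n_per_class = min(len(idxs) for idxs in by_class.values())

    sample = []
    for label in sorted(by_class):
        # Sampling with replacement keeps the bags different from each other.
        sample.extend(rng.choices(by_class[label], k=n_per_class))
    rng.shuffle(sample)
    return sample


# Hypothetical, highly imbalanced label vector (not data from the thesis).
labels = ["normal"] * 900 + ["dos"] * 90 + ["u2r"] * 10

rng = random.Random(0)
# One bag per base learner in the ensemble.
bags = [balanced_bootstrap(labels, rng) for _ in range(5)]
counts = [Counter(labels[i] for i in bag) for bag in bags]
```

Each entry of `counts` holds the same number of samples for every class, so a base learner trained on one bag is not dominated by the majority class. The price is that each learner sees only a small part of the majority class, which is why several bags are combined.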