
Intrusion Detection Research Based On Message Flow

Posted on: 2024-03-15  Degree: Master  Type: Thesis
Country: China  Candidate: Z Sheng  Full Text: PDF
GTID: 2558307094974499  Subject: Computer technology
Abstract/Summary:
The Internet has become an indispensable part of daily life, which makes protecting information security increasingly important. Applying machine learning to intrusion detection makes it possible to monitor network data and hosts, detect abnormal behavior, and respond in a timely manner. Classification models built from training datasets can further improve detection accuracy and efficiency. However, datasets often contain redundant records and high-dimensional features, which degrade detection performance, and class imbalance causes classification performance to differ between majority and minority classes. To address these problems, this thesis proposes a random forest-based ensemble learning algorithm built on tree models.

First, to address the low detection accuracy of traditional machine learning classifiers, the thesis proposes an ensemble learning algorithm based on tree models and compares it with traditional learning methods on an imbalanced dataset. Because tree models are robust and interpretable, they are used as the learners in a Stacking ensemble, which improves on the performance of the individual tree models and makes intrusion detection more robust. The results show that Tree-Stacking achieves an F1-Score of 95.9% and an AUC of 97.29% with a running time of only 33.41 s, the highest scores and the lowest time consumption among the compared algorithms, indicating good detection performance and suitability for real-time use.

Second, to address the large number of redundant features in the original data and the insufficient per-category detection performance of traditional intrusion detection models, the thesis proposes a random forest-based intrusion detection model with tree ensemble learning and gives the corresponding RF-Tree-Stacking algorithm. The K-means clustering algorithm groups similar records to reduce redundancy. Random forest provides feature importance scores, and the features are ranked according to the out-of-bag (OOB) scores to compress the data. The SMOTE algorithm balances the classes that have few samples, which improves the detection of minority classes and yields a better feature subset.

The detection performance of the model is evaluated on the publicly available CICIDS2017 dataset. Three aspects are verified:
(1) Compared with traditional machine learning methods, the proposed RF-Kmeans-Smote method improves Accuracy, Precision, Recall, F1-Score, and AUC by 13.79%, 25.49%, 16.57%, 29.5%, and 7.16%, respectively, and significantly reduces time consumption, verifying the feasibility of the method in machine learning.
(2) RF-Tree-Stacking vs. Tree-Stacking: in the overall evaluation, although Accuracy decreases by 0.08%, the F1-Score improves by 3.27%; by category, although the F1-Score for Normal decreases by 0.16%, the F1-Score for the minority categories improves by 0.17% to 19.71%, verifying the feasibility of the improvement.
(3) Compared with traditional machine learning methods, the proposed RF-Kmeans-Smote algorithm improves detection performance across Accuracy, Precision, Recall, F1-Score, and AUC, with reported gains of 25.83%, 46.66%, 30.76%, and 7.73%, verifying the feasibility of the proposed RF-Tree-Stacking algorithm.

Therefore, the proposed RF-Tree-Stacking algorithm can achieve better results in intrusion detection.
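
To illustrate the class-balancing step mentioned in the abstract, the following is a minimal sketch of SMOTE-style oversampling, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbors. It is not the thesis's implementation: the function names `smote_oversample` and `balance`, the neighbor count `k`, the fixed seed, and the toy data are all assumptions made for this example, and only the Python standard library is used.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbors.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between the base sample and its neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(majority, minority, k=5, seed=0):
    """Oversample the minority class until it matches the majority class in size."""
    n_new = len(majority) - len(minority)
    if n_new <= 0:
        return list(minority)
    return list(minority) + smote_oversample(minority, n_new, k=k, seed=seed)


if __name__ == "__main__":
    # Toy data (assumed): 10 majority records and 3 minority records with 2 features.
    majority = [[5.0, 5.0] for _ in range(10)]
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    balanced = balance(majority, minority, k=2)
    print(len(balanced), "minority samples after balancing")
```

Because each synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region the minority class already occupies. In practice the oversampling would be applied to the training split only, so that synthetic records do not leak into the evaluation data.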
Keywords/Search Tags: Intrusion detection, feature selection, random forest, ensemble learning